ABSTRACT


This research is concerned with the asymptotic properties of feedback
systems containing uncertain parameters and subjected to stochastic
perturbations. The approach is functional analytic in flavor and thereby
avoids the use of Markov techniques and auxiliary Lyapunov functionals
characteristic of the existing work in this area. The results are given
for the probability distributions of the accessible signals in the system
and are proved using the Prohorov theory of the convergence of measures
and some recent work on the preservation of convergence under operations.
For general nonlinear systems a result similar to the Small Loop-Gain
Theorem of deterministic stability theory is given that is sufficient to
guarantee that totally bounded stochastic inputs give rise to totally
bounded outputs. Here boundedness is a property of the induced
distributions of the signals and not the usual notion of boundedness in
norm. For the special class of feedback systems formed by the cascade of
a white noise, a sector nonlinearity, and a convolution operator,
conditions are given to ensure the total boundedness of the overall
feedback system. These conditions are expressed in terms of the Fourier
transform of the convolution kernel, the sector parameters of the
nonlinearity, and the mean and the variance parameters of the noise.
Their form is reminiscent of the familiar Nyquist Criterion and the
Circle Theorem for deterministic systems. Applications of the criteria to
the analysis of rounding errors in machine computations and to the study
of control systems containing human operators are suggested.

CHAPTER 1
INTRODUCTION

1.1 Stability of Dynamical Systems:

The study of dynamical systems has evolved along two paths essentially 
distinct in mathematical formulation. The first, which is based in the 
theory of differential equations, uses the concept of a dynamical system 
as a semigroup of states and thus has an algebraic flavor. For autonomous 
systems (no forcing function) this approach was already well formulated
fifty years ago [8]. For physical systems accurately described by a finite
dimensional set of states which have interpretations as physical variables
(electrical voltages and currents, for example) powerful and precise
conclusions may be drawn about the properties of the system. However, when
the physical system admits no accurate finite dimensional model, the 
general state theory is at this time rather formal and, except in specific 
cases, the precision attained in the finite dimensional case is lost in 
technical difficulties. 

The use of dynamical systems as models for control processes has led 
to a second method of analysis based simply on the input-output properties 
of the system. In this formulation the input and output of a system are
considered as points in a set of functions and the system itself as an 
operator on this function space. Thus, functional analysis replaces the 
theory of differential equations as the source of analytic tools. Problems 
associated with selecting a suitable representation for the internal 
structure of a dynamical element are avoided and large classes of complex 
systems may be treated qualitatively with simple techniques.

Originating only within the past decade, the operator theoretic
treatment of systems has been developed only for the easiest problem
associated with feedback systems: stability. Restricting the set of
inputs to be perturbations of the system, that is, bounded in some sense,
a system is defined to be input-output stable if bounded inputs are
mapped into bounded outputs or, equivalently, if the system is
represented as a bounded operator. In this context boundedness of a
signal may mean the usual boundedness in amplitude or boundedness in some
more sophisticated sense such as total energy or power. In the state
theory stability is defined as asymptotic convergence of the system state
to the zero state. Perturbations are introduced by initial displacements
of the state from zero. For those systems permitting a simple state
representation it is usually easy to pass between the concepts of
input-output stability and state stability [63].

Stability theory in the state space setting relies on the use of 
Lyapunov functionals, certain auxiliary functions of the state. These 
functionals completely specify the asymptotic behavior of the state when 
they can be found and determined to be positive definite and have negative 
definite time derivatives in a neighborhood of an equilibrium state. As
no constructive method of generating Lyapunov functions exists at
present, the general theory remains in a static condition.

By contrast the operator theoretic approach to stability casts the 
problem into a very active area of mathematical research: the
invertibility of operators. To see that this is the case, consider the
equation

     x + KGx = u

as the description of a feedback system. Here K and G represent generally
nonlinear elements in the feedback loop, u is a perturbation input, and
the output x is to be studied. If u is an element of some normed, linear
space of functions, then x is bounded (an element of that space) if
I + KG has a bounded inverse on that space. Hence, the stability question
may be resolved using the mathematical theories relating to the
invertibility of operators on normed or metric spaces. Indeed many new as well
as some familiar results have been developed using spectral theory and 
Banach algebras, two of the basic tools in invertibility studies. 

It is the presence of an active and well-founded theory for the anal- 
ysis of deterministic systems in operator form that motivates this research 
which attempts to extend the theory in such a way as to preserve its 
essential elements and yet account for stochastic signals and uncertain 
parameters in the analysis. 

1.2 Stochastic Systems:

Efforts to model increasingly complex control systems have led to the 
study of some systems which simply cannot be modelled accurately with 
perfect certainty. Uncertainties are introduced either by phenomena that 
are so complicated as to defy reduction to a tractable deterministic 
model or are in essence random. As an example of the former consider the 
generation of roundoff errors in a digital computation. Restricted by 
finite register size the machine must of necessity round-off stored var- 
iables at each stage in a computation. Being a design choice the rounding 
mechanism is not uncertain, and in any given computation of limited com- 
plexity the rounding errors could be monitored exactly. However, in a 
computation of even moderate complexity the register size will be exceeded
at many points in the calculation and monitoring the errors may become
a more formidable task than the original computation. In such a case it 
is reasonable to assume that the evolution of rounding errors is a statis- 
tical process in order to appraise their average magnitude. 

As an example of the introduction of essentially random phenomena into 
a physical experiment consider the problem of maintaining the orientation 
of a rigid body in orbit around the earth. Primary sources of error are
sensor errors and propulsion jet errors (in firing and cutoff times). A
secondary source of error, but a very important one in very precise
applications, is the fluctuation in the earth's gravitational field along
the path of the orbit due to surface irregularities and local variations
in the density of the earth. Because the sensor errors make an exact
determination of position impossible, no model apart from a statistical
one can accurately (within the usually rigid specifications of these
experiments) account for other than the most prominent aberrations. This
problem reduces to the design of a feedback control law capable of
precisely orienting a satellite in the presence of essentially random
perturbations. Moreover,
the controllers (combining sensors and propulsion units) are themselves 
subject to stochastic errors that cannot be deterministically approximated 
within the tolerances fixed for these projects. It is therefore appro- 
priate in a general analysis of systems subjected to uncertainties to con- 
sider not only random external perturbations but to permit random parameter 
variations as well. 

One of the major problems faced at the outset of an analysis of a 
stochastic system is to determine accurate probability distributions for
the quantities considered as random in the experiment. In general some
method of hypothesis testing must be applied to the available data and
distributions deduced from this procedure. Although the possibility of
several empirical distributions is permitted in the definitions of a
stochastic system in section 3.1 below, in the major portions of the
analysis to follow it is assumed that the process of likelihood testing
has been completed and that an optimal distribution has been selected.
For an interesting and important alternate approach for optimal control
problems see the papers and thesis of Witsenhausen [67], [68], and [69].

Following the pattern observed in deterministic systems theory, the 
first problem to be considered for stochastic systems was stability. More- 
over, the framework was that of a state space formulation using Lyapunov 
like techniques. The reasons in both cases were compelling. First, problems 
like optimal control of stochastic systems must proceed in two intimately 
connected steps. Because the state of the system in most cases may be 
observed only in the presence of uncertainties, it must first be estimated. 
Only then may optimal controls be selected. See for instance the work of 
Kushner [48], Wonham [70], Fleming [24], [25], and Benes [3], [4] for
discussions of the problems arising from restricted information on the state.

The study of stochastic systems with a state realization is motivated
by the powerful and comprehensive mathematical apparatus available for
the analysis of Markov processes (see for instance Dynkin [19]). Assuming
no more than causality, any system may be shown to have a Markovian state
decomposition (see Willems [63] for a similar theorem which may be easily
extended to permit stochastic variables), and for those systems with a
finite dimensional state space the analytic theory of Markov processes
combined with the theory of stochastic differential equations completely
determines the system behavior. Using potential functions
of the state (like Lyapunov functionals), the stability of a stochastic 
system with finite dimensional state may be completely determined. This 
program is developed comprehensively in Kushner's book [48]. 

However, in contrast to the deterministic case, there is a very real
confusion over the meaning of stability in a stochastic system. The
confusion stems largely from the numerous distinct varieties of
probabilistic convergence available. Thus, almost sure convergence,
convergence in nth moment, convergence in probability and others have
been used to study the asymptotic properties of perturbed stochastic
systems. However, for systems defined by stochastic differential
equations it is straightforward to
commute between these equations for the trajectories (samples) of the 
signals in the system and the Chapman-Kolmogorov equations for the distri- 
butions of the state and the Fokker-Planck equation for its density function 
(see Ito [40]). 

By examining the asymptotic properties of the solution of the
Fokker-Planck equation, those of the state may be completely determined.
It is in fact entirely appropriate to regard the density function as the
state of the system and to describe the behavior of the system in terms
of its evolution. In this manner Markovian stochastic systems form an
important class of distributed parameter systems (systems whose state
satisfies a partial differential equation), a class somewhat more
amenable to analysis than most because of its special nature
(particularly the boundary conditions)
and the additional interpretation afforded by probabilistic considerations
and the differential equation representation. Important work within this
interpretation has been done by Kushner [47], Dym [17], Elliot [22], Il'in
and Khas'minskii [38] on stability and Fleming [25] on controls.

In the setting provided by a state realization of a stochastic system
the natural way to examine the asymptotic properties of the state is to
introduce Lyapunov functionals of the state and consider their properties.
This has been the approach adopted in almost all of the references
mentioned above. Because of certain relationships between Markov processes
and potential theory (Meyer [51], Hunt [37], Doob [13]) which seem to
account for the restrictions imposed on Lyapunov functionals, the subject
is deserving of further study. For example stochastic Lyapunov functions
were observed by Kushner [49] and Bucy [10] to be positive supermartingales
[51]. However, a supermartingale is a potential subject to certain
restrictions [51]. See Dynkin [20] for a discussion of the position of harmonic
functions and potentials in the analytic theory of Markov processes and 
comments on the construction of harmonic functions for a process. 

What one hopes would come of an investigation of these relationships 
is a procedure for generating Lyapunov functionals for interesting systems. 
At present the obstacle encountered in the deterministic Lyapunov theory 
is present in the stochastic setting; that is, there exists no systematic
method in general of constructing the functionals. Moreover, in specific
cases the construction process is far more difficult in the latter case
(stochastic systems) because of certain technical aspects of the Markovian
structure [48]. For instance deterministic Lyapunov functions must satisfy
a first order partial differential inequality constraining the time
derivative of the functional to be negative definite. In the stochastic
case this inequality involves a second order operator [48, p. 39].

Clearly an alternative approach for the analysis of the asymptotic 
properties of stochastic systems is desirable. The development of such
an alternative along the lines of the operator theoretic stability theory 
is the subject of this research. 

Continuing the analogy with the deterministic theory it would seem 
desirable to have available a kind of "probabilistic functional analysis"
so that the input-output results of the deterministic theory may be easily
rederived in a probabilistic setting. Such a mathematical theory is
available, due largely to a group of Czechoslovak mathematicians headed by
Spacek and Hans [31], [32], [33]. The concepts of random operator
equations defined in those papers are presented here in section 2.3 and used
in section 3.1 to prove some moment bounds for the signals in a general 
stochastic system. It is important to note that these bounds are obtained 
for signals which need not be Markov processes. 


However, it is only in combination with another recent collection of 
work in the general theory of probability that this formulation of a 
stochastic system as a random operator is able to yield results in terms 
of the distributions of the processes involved. This work is concerned 
with topologies for random processes. 

Though introduced by Kolmogorov over thirty years ago, the study of the 
convergence of probability distributions has only recently returned to 
popularity. The papers of Prohorov [53] and Skorokhod [56] in 1956 were 
instrumental in generating this revival of interest. Since that time the 
study of topologies for random processes has evolved in a series of papers
summarized and extended in the books of Billingsley [7], Parthasarathy [52],
and Topsøe [60]. The basic ideas are the following: for any metric space
(X,d) let PM(X) be the set of probability measures on X; then PM(X) may be
regarded as a subset of the dual space of BC(X), the bounded, continuous
functionals on X [16]. A natural weak topology is then induced on PM(X),
and it is this topology that is suitable for determining the distributions
of functions of a random process (see section 3.1 for further motivational
discussions of this point and [29, Chapter IX]).

A key point in the analysis of convergence of distributions is a
description of the compact subsets of PM(X). Under certain conditions on
the basic space X a set of distributions is relatively sequentially compact
(has sequentially compact closure) if and only if there exists a compact
subset of X on which the distributions are concentrated. That is, let
$\Lambda \subset PM(X)$ be the subset under consideration; then $\Lambda$ is
relatively compact if for every $\alpha \in (0,1)$ there exists a compact
subset $K(\alpha)$ of X such that $\mu[K(\alpha)] > 1 - \alpha$ for every
$\mu \in \Lambda$. If X is a space of functions,
suitably metrized, the result says that the distributions of the stochastic
process taking its values in this set of functions are relatively compact
if and only if the values of the process are in a compact set almost
surely. This recurrence condition is familiar in ergodic theory and in a
sense indicates the possibility of interpretations in that setting.
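
As a hedged numerical illustration of the concentration condition (not part of the original text), consider centered Gaussian distributions on R indexed by their standard deviation. The Python sketch below searches for an interval K(alpha) = [-c, c] carrying mass greater than 1 - alpha under every member of a given family. The helper names `mass_on_interval` and `tight_radius`, the doubling search, and the cap `c_max` are assumptions made only for this example.

```python
import math

def mass_on_interval(c: float, sigma: float) -> float:
    """Mass that N(0, sigma^2) assigns to the compact interval [-c, c]."""
    return math.erf(c / (sigma * math.sqrt(2.0)))

def tight_radius(sigmas, alpha, c_max=1e6):
    """Smallest power-of-two radius c with mass > 1 - alpha for every sigma,
    or None if no radius up to c_max works (illustrative search only)."""
    c = 1.0
    while c <= c_max:
        if all(mass_on_interval(c, s) > 1.0 - alpha for s in sigmas):
            return c
        c *= 2.0
    return None

bounded_family = [0.1, 0.5, 1.0, 2.0]              # spreads stay bounded
growing_family = [10.0 ** k for k in range(0, 9)]  # spreads up to 1e8

c_bounded = tight_radius(bounded_family, alpha=0.05)
c_growing = tight_radius(growing_family, alpha=0.05)
```

For the bounded family one interval serves every member. The second family is finite, so a sufficiently large interval would eventually work, but the required radius grows with the largest spread and exceeds the cap used here; letting the spread grow without bound would leave no single compact set, which is the failure of the concentration condition.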

By assuming X to be the space of continuous functions or piecewise 
continuous functions, the compact subsets of X may be easily characterized. 
Sufficient conditions may then be established to assure relative
compactness of a set of distributions defined on X. These are summarized
in section 2.4 for continuous functions and in section 3.3 for piecewise
continuous functions. These conditions are used in sections 3.2 and 3.3
to prove that feedback systems subjected to inputs with relatively
compact distributions give rise to outputs which also have relatively
compact distributions. In section 3.4 these results are used to analyse
the behavior of systems described by stochastic differential equations
subjected to input processes in this class.

Implicit in these proofs (3.2 and 3.3) and explicit in section 3.1 
is the transformation of weakly convergent sequences of distributions by 
operators. That is, a key point in the analysis is contained in the question:
if a convergent sequence of distributions is mapped by an operator (in some
well-defined manner) into another sequence, under what conditions on the
operator is the latter sequence convergent as well? Finding these conditions
forms the heart of the arguments in Chapter 3. The general results that
indicate the line of proof were developed by Billingsley [7] and Topsøe [61]
among others. These conditions are essentially continuity of the operator
on the underlying space X, and in this sense relate back to the operator
stability theory of deterministic systems where continuity of the system
as a map on a function space is a central concept of stability. It is
further in this way, since the feedback equation defines the variable of
interest implicitly, that the mathematical theory relating to invertibility
of operators is once again identified as a crucial aspect of the framework
for the analysis of the asymptotic properties of systems, in this
instance stochastic in nature.

CHAPTER 2

MATHEMATICAL PRELIMINARIES AND BACKGROUND MATERIAL 

2.1 Remarks and Some Notation:

The purpose of this chapter is to recall some of the basic 
notions in the operator theoretic treatment of feedback systems 
and to summarize those aspects of the theory of the convergence 
of probability measures used in Chapter 3. Although the summaries 
here are rather concise, appropriate references are given where more 
thorough treatments may be found. As used here, only the most basic
results from each of these theories are required and in this sense
the background material necessary for the derivations in Chapter 3
is minimal. The only new results in this chapter are a modification
of the usual definition of a random operator and a result on the
effect of such operators on convergent probability distributions
(section 2.4).

Though most of the notation and definitions from mathematics 
used here are standard, a few conventions may be unfamiliar. Symbols 
such as $R = (-\infty, \infty)$, $R^+ = [0, \infty)$, and Z for the set of
integers are standard and are freely used. The notation $C(R^+; R)$
indicates the set of real-valued continuous functions on $R^+$ and is
typical of the form used to designate function spaces. Other common
notations are:

(i) $(L_p(R^+), \|\cdot\|_p)$ the Lebesgue spaces on $R^+$;

(ii) $(\Omega, \mathcal{F}, P)$ a basic probability space;

(iii) $(X, \|\cdot\|)$ a normed, linear space;

(iv) $(X, \mathcal{B}(X))$ a Borel measurable space, $\mathcal{B}(X)$ the Borel
$\sigma$-algebra of X;

(v) $F(\Omega; X)$ the set of X-valued random variables on $\Omega$;

(vi) PM(X) the set of probability measures on X;

(vii) BC(X) the set of bounded, continuous (real-valued)
functionals on X;

(viii) $\mathcal{L}(X)$ the set of operators mapping $X \to X$, and

(ix) $\{\pi_t\}_{t \in R^+}$ the set of truncation operators on some function
space.

Operators on sets of functions are usually denoted by F, G, H or 
some other upper case letter. These points are representative of 
the standard conventions used here. 

As a consequence of the mixture of engineering material and 
some mathematics a few compromises in notation have been necessary. 
For example the symbol P is reserved for the basic probability
measure on $\Omega$, and so $\{\pi_t\}$ is used to denote a set of truncation
operators on functions, usually denoted by $\{P_t\}$ in the engineering
literature (see for instance Willems [64]).

The terms "stochastic process," "random process," and "function 
valued random variable" are to be considered equivalent here. The 
concept of a random variable as a measurable function is used, and
when the random variable takes its values in some set of functions,
one of the above terms is used to indicate this case. The concept
of a stochastic process as an "indexed set of random variables" [29]
is not used. The words distribution and probability measure are
used interchangeably and should be considered equivalent. Thus, the
more common meaning of the former term is never employed here.

Finally, real-valued functions are used almost exclusively 
in this work, though it is acknowledged that nearly all the results 
are true for $R^n$-valued processes. The methods used in the paper [66]
to extend the theorems there to the multi-dimensional case may be 
applied to the results here at the cost of some complication of the 
notation. The only exception to this voluntary restriction to
real-valued functions occurs in section 3.4 where some earlier work is
summarized and compared to that given here. The concept of state is
fundamental in the differential equation formulation used in the
earlier work, and so multi-dimensional functions must be used for the
state to illustrate the theorems thoroughly.

2.2 Input-Output Stability of Deterministic Feedback Systems:

In this section a brief summary of the operator-theoretic 
analysis of feedback systems is presented. The purpose here is to 
recall a familiar class of problems and indicate an appropriate 
framework for their analysis. The concept of a feedback system as 
an operator on function spaces is introduced and stability of the 
feedback system defined in terms of the continuity properties of 
the operator. Appropriate references are the original papers of 
Sandberg [54] and Zames [74], and the papers [63], [65] and monograph
[64] of Willems. The thesis of Davis [12] gives a rather complete 
treatment of the input-output theory of general linear systems.

Let $X_e$ be a vector space of V-valued functions on the set
$R^+$; that is, each element x of $X_e$ maps $R^+$ into V where V is some
given vector space. Let G be an operator mapping $X_e$ into itself
such that G0 = 0. For $u \in X_e$ as an "input" consider the following
equation

(1)  $x(t) + (Gx)(t) = u(t)$,  $t \in R^+$,

as descriptive of a deterministic feedback system. The operator G
represents the cascade of all the elements of the open-loop system
and x the "feedback error." As a model of physical elements the
operator G must be causal and the solution x must be bounded over
finite time intervals (bounded subsets of $R^+$). These requirements
are made precise using the truncation operators $\{\pi_t\}$ defined by

     $(\pi_t x)(s) = \begin{cases} x(s), & s \le t \\ 0, & s > t \end{cases}$ ;  $s, t \in R^+$.

Assume that $X_e$ is closed under these truncations. The operator G
is causal if (pointwise)

     $\pi_t G x = \pi_t G \pi_t x$,  $t \in R^+$, $x \in X_e$.

Assume that $X_e$ has a normed subspace $(X, \|\cdot\|)$ and that the
truncations are the projections, $\pi_t : X_e \to X$ for every t. The
existence of a locally bounded solution to (1) is assumed in the
following fashion: every (input) element u of $X_e$ gives rise to a
unique (output) element x of $X_e$ such that

     $(\pi_t x)(s) + (\pi_t G(\pi_t x))(s) = (\pi_t u)(s)$ ;  $s, t \in R^+$.

Thus, by the projection property of $\pi_t$ the function $\pi_t x$ is an element
of X and is thus bounded for every $t \in R^+$.
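
As an informal illustration of the truncation and causality definitions above (not part of the original text), the following Python sketch uses a discrete-time analogue in which signals are finite lists and the truncation at index t zeroes every later sample. The names `truncate`, `convolve_causal`, `anticipate`, and `is_causal` are hypothetical helpers introduced here. The check is made on a single test signal, so it can refute causality but cannot prove it.

```python
def truncate(x, t):
    """Discrete analogue of pi_t: keep samples with index <= t, zero the rest."""
    return [v if s <= t else 0.0 for s, v in enumerate(x)]

def convolve_causal(g, x):
    """(Gx)[n] = sum over k <= n of g[n-k] * x[k], a causal convolution."""
    return [sum(g[n - k] * x[k] for k in range(n + 1) if n - k < len(g))
            for n in range(len(x))]

def anticipate(x):
    """Non-causal one-step advance: (Ax)[n] = x[n+1], zero past the end."""
    return x[1:] + [0.0]

def is_causal(op, x, horizon):
    """Check pi_t G x == pi_t G pi_t x for t = 0, ..., horizon - 1."""
    return all(truncate(op(x), t) == truncate(op(truncate(x, t)), t)
               for t in range(horizon))

g = [1.0, 0.5, 0.25]                 # illustrative kernel samples
x = [1.0, -2.0, 3.0, 0.5, 4.0]       # illustrative test signal

causal_ok = is_causal(lambda z: convolve_causal(g, z), x, len(x))
advance_ok = is_causal(anticipate, x, len(x))
```

The convolution passes the test because its output up to index t depends only on input samples up to index t, whereas the advance operator fails already at t = 0.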

Having assumed the existence of solutions bounded over finite 
intervals, the system is said to be stable if bounded (on R + ) inputs 
give rise to outputs bounded over the entire time set. Or precisely: 
Definition 1: The feedback system (1) under consideration is said
to be X-stable if for any $u \in X$ the following conditions hold:

(i) the solution x corresponding to u is actually an element
of X,

and (ii) there exists a $K < \infty$ independent of $u \in X$ such that

     $\|x\| \le K \|u\|$.

The nature of the definition is clarified by the following 
restatement: 

Theorem 2: [65, Theorem 4.1] Assume that I + G has a causal
inverse on $X_e$; then a necessary and sufficient condition that (1)
be X-stable is that $(I + G)^{-1}$ be bounded on X.

The theorem indicates clearly that the correct mathematical 
framework for the investigation of input-output stability is in
that theory relating to the invertibility of (causal) operators. 

For example if the operator G is linear then the invertibility of 
I + G requires that -1 not be an element of the spectrum of G [12]. 
For linear, time-invariant convolution operators on several Banach
spaces [11] the spectrum of G is the set (assuming $g \in L_1(R^+)$)

     $\sigma(G) = \bigcup_{\mathrm{Re}(s) \ge 0} \int_0^\infty e^{-st} g(t)\, dt$.

The stability condition on G in this case is the familiar Nyquist
Criterion.
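
A rough numerical sketch of this spectral condition, under assumptions that are not in the original text: take the kernel g(t) = k exp(-t), whose Laplace transform is k/(s + 1), and measure how close the image of a finite grid of the closed right half-plane comes to the point -1. The function names `transfer` and `distance_to_minus_one`, the kernel, and the grid parameters are all illustrative choices; a finite grid can only indicate, not prove, that -1 is excluded.

```python
def transfer(s: complex, k: float) -> complex:
    """Laplace transform of the illustrative kernel g(t) = k * exp(-t)."""
    return k / (s + 1.0)

def distance_to_minus_one(k: float, n: int = 200, step: float = 0.05) -> float:
    """Minimum of |G(s) + 1| over a grid with 0 <= Re(s) <= n * step."""
    best = float("inf")
    for i in range(n + 1):            # Re(s) = i * step >= 0
        for j in range(-n, n + 1):    # Im(s) = j * step
            s = complex(i * step, j * step)
            best = min(best, abs(transfer(s, k) + 1.0))
    return best

stable_margin = distance_to_minus_one(k=0.5)     # -1 stays away from the image
unstable_margin = distance_to_minus_one(k=-2.0)  # G(1) = -1 lies in the image
```

For k = -2 the value G(1) = -1 is attained at s = 1, so I + G fails to be invertible; for k = 0.5 the image keeps a distance of at least one from -1 on the grid.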

For the abstract equation (1) the need for general invertibility 
criteria led to the following theorem whose proof was perhaps initially 
motivated by some similar inequalities in the theory of Banach 
algebras (see Bachman [1, p. 34]).


Theorem 3: (Small-Gain Theorem) For the equation (1) under the
existence and causality assumptions suppose that G is a contraction
on X; i.e. there exists a constant $\alpha < 1$ such that

     $\sup_{f \in X} \frac{\|Gf\|}{\|f\|} \le \alpha < 1$.

Then for any $u \in X$ the inequality

     $\|\pi_t x\| \le (1 - \alpha)^{-1} \|\pi_t u\|$

holds for every $t \in R^+$. Hence, $u \in X$ implies that $x \in X$ and
$\|x\| \le (1 - \alpha)^{-1} \|u\|$, and so, that (1) is X-stable.
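
The bound in Theorem 3 can be checked numerically on a toy example. The sketch below is only an illustration under assumptions not found in the original: time is discrete, the norm is the sup norm, and G is a strictly causal convolution applied to tanh(x), so that the sum of the absolute kernel values serves as the gain bound alpha (because |tanh(v)| <= |v|). The names `solve_feedback` and `sup_norm` and all numerical values are hypothetical.

```python
import math

def solve_feedback(u, g):
    """Solve x[n] + sum_{k<n} g[n-k] * tanh(x[k]) = u[n] by forward recursion."""
    x = []
    for n in range(len(u)):
        fed_back = sum(g[n - k] * math.tanh(x[k])
                       for k in range(n) if n - k < len(g))
        x.append(u[n] - fed_back)
    return x

def sup_norm(x):
    """Sup norm of a finite (truncated) signal."""
    return max(abs(v) for v in x)

g = [0.0, 0.4, -0.2, 0.1]             # strictly causal kernel, g[0] = 0
alpha = sum(abs(v) for v in g)        # gain bound in the sup norm, about 0.7
u = [math.sin(0.3 * n) + (1.0 if n % 7 == 0 else 0.0) for n in range(200)]
x = solve_feedback(u, g)

# Truncated inequality ||pi_t x|| <= ||pi_t u|| / (1 - alpha) for every t,
# with a small tolerance for floating-point rounding.
bound_holds = all(
    sup_norm(x[: t + 1]) <= sup_norm(u[: t + 1]) / (1.0 - alpha) + 1e-12
    for t in range(len(u))
)
```

Because the kernel is strictly causal, the recursion computes the unique solution directly, and the truncated bound can be tested at every horizon t rather than only at the final time.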

The power of this simple and obvious result is only fully
realized in its special cases, one of which is the Circle Theorem, 
a striking generalization of the Nyquist Criterion. Let the vector 
space V be R and define the (nonlinear) operator G as 

     $(Gx)(t) = \int_0^t g(t-s) f(s, x(s))\, ds$

where $f : R^+ \times R \to R$ is continuous (separately) and the kernel g
is locally $L_1(R^+)$ (absolutely integrable on finite intervals). Assume
that the feedback equation u = x + Gx is well-posed (has a unique
solution) on the space $X_e = L_{\infty e}$ (the extended space with normed
subspace $(L_\infty(R^+), \|\cdot\|_\infty)$).

Theorem 4: ($L_\infty$-Circle Theorem [75]) Assume the following:

(i) $u \in (L_\infty(R^+), \|\cdot\|_\infty)$.

(ii) For some $r_0 > 0$

     $\int_0^\infty e^{r_0 t} |g(t)|\, dt < \infty$.

(iii) For some constants $a, b \in R^+$

     $0 < a \le \frac{f(t,x)}{x} \le b < \infty$, for every $t \in R^+$, $x \in R$.

(iv) For G(s) the Laplace transform of g, and some $r \in (0, r_0)$
the exclusion holds

     $\{-[\tfrac{1}{2}(a+b)]^{-1}, j0\} \notin \bigcup_{\mathrm{Re}(s) \ge -r} G(s)$.

(v) For some $r \in (0, r_0)$

     $\sup_{\xi \in R} |G^{-1}(-r + j\xi) + \tfrac{1}{2}(a+b)| > \tfrac{1}{2}(b-a)$.

Then $x \in L_\infty(R^+)$ and $\|x\|_\infty \le K \|u\|_\infty$ for some
$K < \infty$ independent of u.

Remark 5: Conditions (iv) and (v) mean that the r-shifted Nyquist
locus of G does not encircle (iv) or intersect (v) the closed disc in
the complex plane centered at $\{-[\tfrac{1}{2}(a+b)]^{-1}, j0\}$ with
radius $\tfrac{1}{2}(b-a)$.

The theorem is valid on, for instance, $L_2(R^+)$ with $r_0 = 0$;
however, in the version to be used here (Theorems 3.2.3, 3.3.6)
the assumption of "decaying memory" (ii) for the convolution seems
necessary in the proof [75]. Note that if a = b the theorem reduces
to the Nyquist Criterion which is necessary as well.

In section 3.1 a theorem similar to the Small-Gain Theorem is 
used to establish general conditions for the asymptotic invariance 
of the probability distribution of the solution of a general random 
equation. In sections 3.2 and 3.3 conditions like the Circle 
Theorem and the Nyquist Criterion are used to guarantee asymptotic 
invariance for the solutions of random convolution equations. Before 
proving this result it is necessary to describe precisely the structure
of a random operator equation, and to introduce a topology suitable for
the analysis of probability distributions induced by random processes.
These topics are discussed in the next two sections. 

2.3 Random Operator Equations:

A. Probability spaces:

In this section the concept of a random operator as a model of 
a physical element with random parameters is rendered precise by 
defining it to be an operator valued random variable. Certain properties 
of random operators are noted and the nature of random operator 
equations investigated. Appropriate references for this section are 
the papers of Hans [31], [32], [33] and the survey of Bharucha-Reid [6].

Let $(\Omega, \mathcal{F}, P)$ denote a basic probability space. When this 
triple occurs in the sequel, the assumptions below will be implicit: 

(i) $(\Omega, \tau)$ is a topological space, always separable;* $\tau$ denotes 
the topology of the set $\Omega$. 

* See [7, Appendix III] for the implications of this constraint. 

(ii) $\mathcal{F}$ is the Borel $\sigma$-algebra generated by the topology $\tau$. 
That is, the least sub-class of $2^{\Omega}$ (the class of all the 
subsets of $\Omega$) closed under complementation and countable unions, 
and containing $\tau$. 

(iii) $P$ is a probability measure, by definition a complete (subsets 
of sets of measure zero have measure zero), countably additive 
($P\{\bigcup_{i=1}^{\infty} A_i\} = \sum_{i=1}^{\infty} P(A_i)$, $A_i \in \mathcal{F}$, 
$A_i \cap A_j = \emptyset$, $i \neq j$), finite ($P(\Omega) < \infty$) set function 
mapping $\mathcal{F}$ into $R^+$, normalized so that $P(\Omega) = 1$. 

For any measurable space $(X, \mathcal{B}(X))$, where $\mathcal{B}(X)$ indicates the 
Borel $\sigma$-algebra of $X$, let $F(\Omega;X)$ denote the set of $X$-valued random 
variables on $\Omega$; that is, the set of functions $f : \Omega \to X$ that are 
measurable in the sense that $f^{-1}\mathcal{B}(X) \subset \mathcal{F}$, or that the inverse 
image of every measurable set is measurable. 

Example 1 : (Gaussian measure) Let $(X, \mathcal{B}(X)) = (R, \mathcal{B}(R))$, the real line 
with $\mathcal{B}(R)$ generated by the open intervals of $R$. Let $f : R \to R$ be 
a continuous function (hence $\mathcal{B}(R)$ measurable) and assume that the 
measure $\mu_f$ is defined by 

\[ \mu_f(A) = P\{\omega \in \Omega : f(\omega) \in A \in \mathcal{B}(R)\} 
   = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} 1_A(x)\, e^{-x^2/2}\, dx .^{*} \]

Then $(R, \mathcal{B}(R), \mu_f)$ is a probability space and $f$ is a Gaussian 
random variable. 

* $1_A$ denotes the characteristic function of the set $A$. 
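
As a small numerical illustration of Example 1, the measure of an interval $A = [a, b)$ under the standard Gaussian density can be evaluated with the error function. This is only a sketch; the function name `gaussian_measure` is an assumption introduced here and does not appear in the source.

```python
import math


def gaussian_measure(a: float, b: float) -> float:
    """Return mu_f([a, b)) for the density exp(-x^2/2) / sqrt(2*pi)."""

    def cdf(x: float) -> float:
        # Standard normal distribution function written with erf.
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    return cdf(b) - cdf(a)


# The whole line has measure one, as required of a probability measure.
total_mass = gaussian_measure(-math.inf, math.inf)
# Finite additivity over two adjacent intervals.
left = gaussian_measure(-1.0, 0.0)
right = gaussian_measure(0.0, 2.0)
union = gaussian_measure(-1.0, 2.0)
```

The additivity check mirrors the countable additivity required in assumption (iii), restricted to two disjoint intervals.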
Example 2 : (Wiener measure) Let $(X, \mathcal{B}(X)) = (C(R^+), \mathcal{B}(C))$ where 
$C(R^+)$ is the set of real-valued continuous functions on $R^+$ topologized 
by uniform convergence on compact intervals. $\mathcal{B}(C)$ is the least 
$\sigma$-algebra containing sets of the form 

\[ A(t;a,b) = \{f \in C : f(t) \in [a,b) \subset R\} , \quad t \in R^+ . \]

A measure $P$ is induced on $\mathcal{B}(C)$ by its definition on such sets $A$: 

\[ P\{A(t;a,b)\} = P\{f \in C : f(t) \in [a,b) \mid f(r),\ r \le s < t\} 
   = \frac{1}{\sqrt{2\pi\sigma^2(t-s)}} \int_a^b e^{-[x-f(s)]^2 / 2\sigma^2(t-s)}\, dx . \]

$P$ is in this instance the Wiener measure [7]. Note that for $s = 0$, 
the assumption $f(0) = 0$ is standard. 

B. Random Equations : 

The following definition was given by Hans [31]. 

Definition 3 . Let $(\Omega, \mathcal{F}, P)$ and $(X, \mathcal{B}(X))$ be given, then a map 
$T : \Omega \times F(\Omega;X) \to X$ is a random operator if $T(\cdot, x(\cdot)) = y(\cdot)$ is a 
random variable (an element of $F(\Omega;X)$). 

Example 4 : (Deterministic operators) Let $G$ be a continuous map of 
$X$ into $X$. Then it is routine to verify that $G : F(\Omega;X) \to F(\Omega;X)$ 
and that every continuous deterministic operator is a random operator 
according to Definition 3. 

Example 5 : (Linear convolution) Let $(X, \mathcal{B}(X)) = (C(R^+), \mathcal{B}(C))$ 
and let $g \in C(R^+)$. Let $w$ denote the Wiener process, and $x \in F(\Omega;C)$ 
independent of $w$. Then the $C$-valued function $y$ (on $\Omega$) defined by 
(its value at $t$) 

\[ y(t,\omega) = \int_0^t g(t-s)\, x(s,\omega)\, dw(s,\omega) , \quad t \in R^+ ,\ \omega \in \Omega \]

is an element of $F(\Omega;C)$. The integral on the right is defined in 
the Ito sense; its properties and a proof of the assertion here 
are given in Ito [40]. The convolution above defines a linear random 
operator on the space of $C$-valued random variables. 

An alternate formulation of the notion of a random operator 
may be given as follows: Let $(X,d)$ be a separable metric space and 
$\mathcal{D}(X)$ the set of all continuous maps $X \to X$. Give to the set $\mathcal{D}(X)$ 
the (strong) topology $T$ generated by the convergence $G_n \to G$ if and only 
if 

\[ d(G_n x, Gx) \to 0 \quad \text{for every } x \in X . \]

Let $\mathcal{B}(\mathcal{D})$ denote the Borel $\sigma$-algebra of subsets of $\mathcal{D}(X)$ generated 
by this topology. Then for any $(\Omega, \mathcal{F}, P)$ given, let $F(\Omega;\mathcal{D})$ denote 
the set of $\mathcal{D}$-valued random variables. That is, each element $G$ of 
$F(\Omega;\mathcal{D})$ maps $\Omega$ into $\mathcal{D}(X)$ such that $G(\omega)[\cdot] = G_\omega(\cdot) \in \mathcal{D}(X)$. Thus, 
for every $\omega \in \Omega$, $G(\omega)[\cdot]$ is a continuous map $X \to X$; and so, this 
definition coincides with Definition 3 on the continuous operators. 

Moreover, it is clear that probability distributions may be introduced 
on $\mathcal{B}(\mathcal{D})$ and convergence arguments made for random operators 
as well as for random variables in the usual manner. In the next section 
this possibility is investigated further and the preservation of probabilistic 
convergence under random operations discussed. 

The use of random operator equations in section 3.1 necessitates 
a discussion of the nature of a solution to such an equation. 

Definition 6 : For $(\Omega, \mathcal{F}, P)$, $(X, \mathcal{B}(X))$, $G \in F(\Omega;\mathcal{D})$, and 
$y \in F(\Omega;X)$ given, every element $x$ of $F(\Omega;X)$ satisfying 

\[ P\{\omega : G(\omega)[x(\omega)] = y(\omega)\} = 1 \]

is a solution of the equation $Gx = y$. 

Thus, a solution is required to be a random variable; that is, 
to have certain measurability properties. This qualification has 
been the source of a considerable amount of research on the nature 
of solutions to random equations (see for instance Hans [32], 
Bharucha-Reid [6]). Most of this has been a consequence of the 
ambiguous nature of Definition 3. 

Assume that $(X, \|\cdot\|)$ is a Banach space. An element $G$ of 
$F(\Omega;\mathcal{D}(X))$ is said to be a random contraction if there exists a 
real-valued random variable $c$ such that $c(\omega) < 1$, for every $\omega \in \Omega$, 
and 

\[ \|G(\omega)[x_1] - G(\omega)[x_2]\| \le c(\omega)\, \|x_1 - x_2\| . \]

The analog of the Banach-Caccioppoli fixed point theorem [42, p. 627] 
in this setting is: 

Theorem 7 : [33] Let $(X, \|\cdot\|)$ be a Banach space, $G \in F(\Omega;\mathcal{D}(X))$ 
a random contraction, then there exists an element $x$ of $F(\Omega;X)$ such 
that 

\[ P\{\omega : G(\omega)[x(\omega)] = x(\omega)\} = 1 . \]

The random variable $x$ is unique almost everywhere ($P$), and may be 
obtained by the process of successive approximations starting at 
any initial element $x_0$ of $F(\Omega;X)$. 

This result is then the basis of the proof of existence and 
uniqueness of solutions of random operator equations. For the 
equation $Gx = y$, given $G \in F(\Omega;\mathcal{D}(X))$ and $y \in F(\Omega;X)$, if $G$ may be 
shown to be a contraction, then Theorem 7 may be invoked to assure 
the existence of a unique element of $F(\Omega;X)$ (as a set of equivalence 
classes under $P$) as the solution. Moreover, the classical scheme of 
Picard iterations may be used to approximate the solution. This 
result is somewhat more subtle than is apparent at first reading, 
as it implies that the Picard iterates are at each step random 
variables, and that they approach almost surely a random element which is 
the desired solution. In most cases, of course, only local existence 
and uniqueness may be established in this manner. 
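
The successive-approximation scheme of Theorem 7 can be sketched sample by sample. The code below is only an illustration on $X = R$: the family $G(\omega)[x] = c(\omega)\sin x + b(\omega)$ with $c(\omega) \le 0.9$ is a hypothetical random contraction chosen here for convenience, and the helper names are assumptions, not notation from the source.

```python
import math
import random


def picard_fixed_point(G, x0, tol=1e-12, max_iter=10_000):
    """Iterate x <- G(x) until successive iterates differ by less than tol."""
    x = x0
    for _ in range(max_iter):
        x_next = G(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("Picard iteration did not converge")


def sample_random_contraction(rng):
    """Draw one sample operator G(omega) with Lipschitz constant c(omega) < 1."""
    c = rng.uniform(0.0, 0.9)   # random contraction constant
    b = rng.gauss(0.0, 1.0)     # random offset
    return lambda x: c * math.sin(x) + b


rng = random.Random(0)
fixed_points = []
for _ in range(5):
    G = sample_random_contraction(rng)
    x_star = picard_fixed_point(G, x0=0.0)
    fixed_points.append((x_star, G(x_star)))
```

Each pass of the loop plays the role of one sample point $\omega$; the pair stored for that sample lets one confirm that $G(\omega)[x(\omega)] = x(\omega)$ up to the iteration tolerance.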

C. Moment spaces : 

As the convergence arguments used in Chapter 3 utilize certain 
moment bounds, it is appropriate at this point to introduce a few 
definitions of "moment spaces" and consider operators on these spaces. 
Let $(\Omega, \mathcal{F}, P)$ and $(X, \mathcal{B}(X), \|\cdot\|)$ be given and denote by $E(\cdot)$ the 
usual expectation operator on the subset of $F(\Omega;X)$ for which 

\[ Ex = \int_\Omega x(\omega)\, P(d\omega) \]

is well-defined as a Bochner integral [72, p. 219]. 

In particular define the sets (of equivalence classes) 

\[ \mathcal{L}_q(\Omega;X) = \{x \in F(\Omega;X) : \|x\|_q = (E\{\|x\|^q\})^{1/q} < \infty ;\ q \in [1,\infty)\} . \]

And in the case that $(X, \|\cdot\|)$ is a Banach space of real-valued 
functions on $R^+$, the spaces 

\[ \mathcal{L}_q(\Omega;X;\ell) = \{x \in F(\Omega;X) : \|x\|_{q,\ell} = (\ell[E\{|x(t,\omega)|^q\}])^{1/q} < \infty ;\ q \in [1,\infty)\} . \]

Here $\ell$ is any sub-additive linear functional* on real-valued functions. 
Typical examples used here are 

\[ \ell_1(f) = \operatorname*{ess\,sup}_{t \in R^+} |f(t)| , \qquad \ell_2(f) = \int_0^\infty |f(t)|\, dt . \]

* $\ell(x+y) \le \ell(x) + \ell(y)$, $\ell(ax) = |a|\,\ell(x)$ for real-valued functions 
$x, y$ on $R^+$ and $a \in R$. 

Under these restrictions on $\ell$ it is clear that $\|\cdot\|_{q,\ell}$ is a norm 
and $(\mathcal{L}_q, \|\cdot\|_{q,\ell})$ a normed linear space. Under the choices $\ell_1$ 
and $\ell_2$, $\mathcal{L}_q$ is a Banach space as well. Thus, elements of $\mathcal{L}_q(\Omega;X;\ell_1)$ 
are (almost surely) bounded in $q$th absolute moment. Elements of 
$\mathcal{L}_q(\Omega;X;\ell_2)$ have absolutely integrable $q$th moments. See Ito [39] or 
Skorokhod [55] where similar spaces are defined and used in existence 
arguments for stochastic differential equations. 

Assuming that $(X, \|\cdot\|)$ is a space of functions closed 
under the truncation operation ($\pi_t$, see section 2.2), the "extended 
spaces" $\mathcal{L}_{qe} = \{x \in F(\Omega;X) : \pi_t x \in \mathcal{L}_q ,\ t \in R^+\}$ are convenient 
for certain statements pertaining to the existence of solutions in 
feedback equations. 

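Let $\mathcal{D}(X)$ denote again the set of operators mapping $X$ into 
itself continuously. For those elements $G$ of $F(\Omega;\mathcal{D})$ for which the 
supremum is finite define 

\[ \|G\|_q = \sup_{x \in \mathcal{L}_q ,\ \|x\|_q \neq 0} \frac{\|Gx\|_q}{\|x\|_q} . \]

And under the assumption that $X$ is a function space and $\mathcal{D}(X)$ consists 
of causal operators (see section 2.2), then for $G \in F(\Omega;\mathcal{D})$ set 

\[ \|G\|_q = \sup_{x \in \mathcal{L}_{qe} ,\ t \in R^+ ,\ \|\pi_t x\|_q \neq 0} \frac{\|\pi_t G x\|_q}{\|\pi_t x\|_q} . \]

Note that in this case $\|G\|_q$ depends on $\ell$. 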

A few examples are given below to illustrate the definitions. 


Example 8 : Consider the space $\mathcal{L}_2(\Omega;X;\ell_1)$ and the (deterministic) 
operator $G$ on $X = C(R^+)$, the continuous functions, given by 

\[ (Gx)(t,\omega) = y(t,\omega) = \int_0^t g(t-s)\, x(s,\omega)\, ds . \]

Then 

\[ E\,y^2(t) = \int_0^t\!\!\int_0^t g(t-s)\, g(t-r)\, E\{x(s)x(r)\}\, ds\, dr 
   \le \Big( \int_0^t |g(t-s)|\, (E\{x^2(s)\})^{1/2}\, ds \Big)^2 . \]

Hence, 

\[ \|y\|_{2,\ell_1} \le \int_0^\infty |g(t)|\, dt \ \|x\|_{2,\ell_1} , \]

and the bound is attained. 

Example 9 : Consider $\mathcal{L}_2(\Omega;X;\ell_2)$ and the operator $G$ on $X = L_2(R^+)$, 
the space (deterministic) of square-integrable real-valued functions. 
Then, for $y = Gx$, where 

\[ y(t,\omega) = \int_0^t g(t-s)\, x(s,\omega)\, ds , \quad t \in R^+ ,\ \omega \in \Omega , \]

one has 

\[ \int_0^\infty E\{y^2(t)\}\, dt = \int_0^\infty E\Big\{ \Big( \int_0^t g(t-s)\, x(s)\, ds \Big)^2 \Big\}\, dt \]
\[ \le \int_0^\infty \Big[ \int_0^\infty |g(r)|\, dr \int_0^t |g(t-s)|\, E\{x^2(s)\}\, ds \Big] dt \]
\[ \le \Big( \int_0^\infty |g(t)|\, dt \Big)^2 \|x\|_{2,\ell_2}^2 . \]

This bound, however, may be improved by taking into account 
the fact that for each $\omega \in \Omega$ the integral 

\[ \int_0^\infty |x(t,\omega)|^2\, dt < \infty . \]

Hence, each sample function $x(\omega)$ admits a Fourier transform 

\[ \hat{x}(jv,\omega) = \int_0^\infty x(t,\omega)\, e^{-jvt}\, dt , \quad \omega \in \Omega ,\ v \in R . \]

Here equality holds in the $L_2(R^+)$ sense. Assuming that $g \in L_2(R^+)$ has 
a transform $\hat{G}(jv)$, then for each $\omega \in \Omega$ 

\[ \int_0^\infty y^2(t,\omega)\, dt \le \sup_{v \in R} |\hat{G}(jv)|^2 \int_0^\infty x^2(t,\omega)\, dt \]

and use of the Lebesgue Dominated Convergence Theorem [16, p.151] 
for $E(\cdot)$ permits the conclusion 

\[ \|y\|_{2,\ell_2} \le \sup_{v \in R} |\hat{G}(jv)| \ \|x\|_{2,\ell_2} . \]

Moreover, this bound is attained. 
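
The gap between the two bounds of Example 9 can be seen numerically. The sketch below uses a discrete-time analog (finite sequences in place of functions on $R^+$), which is an assumption made for illustration only; the kernel `[1, 1, -1]` and all function names are hypothetical. For this kernel the supremum of the transform is $\sqrt{5}$, strictly smaller than the absolute sum 3.

```python
import cmath
import math
import random


def convolve(g, x):
    """Full discrete convolution y = g * x of two finite sequences."""
    y = [0.0] * (len(g) + len(x) - 1)
    for i, g_i in enumerate(g):
        for j, x_j in enumerate(x):
            y[i + j] += g_i * x_j
    return y


def l2_norm(seq):
    """Square root of the sum of squares."""
    return math.sqrt(sum(v * v for v in seq))


def sup_transform(g, n_grid=4096):
    """Maximum modulus of the discrete-time transform of g on a frequency grid."""
    return max(
        abs(sum(g_k * cmath.exp(-1j * 2.0 * math.pi * m * k / n_grid)
                for k, g_k in enumerate(g)))
        for m in range(n_grid)
    )


g = [1.0, 1.0, -1.0]                        # hypothetical kernel
rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
y = convolve(g, x)

l1_bound = sum(abs(v) for v in g)           # analog of the integral of |g|
freq_bound = sup_transform(g)               # analog of sup |G(jv)|
ratio = l2_norm(y) / l2_norm(x)             # observed gain on one sample path
```

The observed gain stays below the frequency-domain bound, which in turn is below the absolute-sum bound, in line with the improvement claimed in the example.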

Example 10 : Again for the case $X = C(R^+)$, consider the random operator 
$G$ on $F(\Omega;C)$ given by 

\[ (G(\omega)[x(\omega)])(t) = y(t,\omega) = \int_0^t g(t-s)\, x(s,\omega)\, dw(s,\omega) , \quad t \in R^+ ,\ \omega \in \Omega \]

on non-anticipating functions (i.e. $\pi_t x$ is independent of $(I - \pi_t)w$ 
for all $t$, see section 3.2). The following calculations 

(1) \[ E\{y^2(t)\} = E\Big\{ \Big( \int_0^t g(t-s)\, x(s)\, dw(s) \Big)^2 \Big\} 
       = \sigma^2 \int_0^t g^2(t-s)\, E\{x^2(s)\}\, ds \]

(here $E\{(dw(t))^2\} = \sigma^2\, dt$) permit the conclusion 


where $\hat{G}$ is the Fourier transform of the kernel $g$. See, for instance, 
McKean [50] for details of the reduction of (1) which makes use of 
the decisive property of orthogonal increments of w. Extension of 
this idea is the basis of several moment inequalities proven and 
used in sections 3.2 and 3.3 below. 
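
The second-moment identity (1) of Example 10 can be checked by simulation. The following is a Monte Carlo sketch under stated assumptions: the Ito integral is replaced by a left-endpoint sum, the kernel is $g(\tau) = e^{-\tau}$, and the integrand is the non-anticipating process $x(s) = \pm 1$ according to the sign of $w(s)$, so that $E\{x^2(s)\} = 1$. None of these choices comes from the source.

```python
import math
import random


def simulate_second_moment(g, t_end, n_steps, sigma, n_paths, seed=0):
    """Estimate E{y^2(t_end)} for y = sum g(t - s_k) x(s_k) dw_k."""
    rng = random.Random(seed)
    dt = t_end / n_steps
    total = 0.0
    for _ in range(n_paths):
        w = 0.0
        y = 0.0
        for k in range(n_steps):
            s = k * dt
            x = 1.0 if w >= 0.0 else -1.0       # uses w only up to time s
            dw = rng.gauss(0.0, sigma * math.sqrt(dt))
            y += g(t_end - s) * x * dw          # left-endpoint (Ito) sum
            w += dw
        total += y * y
    return total / n_paths


def predicted_second_moment(g, t_end, n_steps, sigma):
    """Right-hand side of (1) on the same grid, with E{x^2(s)} = 1."""
    dt = t_end / n_steps
    return sigma ** 2 * sum(g(t_end - k * dt) ** 2 * dt for k in range(n_steps))


def kernel(tau):
    return math.exp(-tau)


estimate = simulate_second_moment(kernel, 1.0, 20, 0.5, 20_000)
prediction = predicted_second_moment(kernel, 1.0, 20, 0.5)
relative_error = abs(estimate - prediction) / prediction
```

The agreement relies on the orthogonal-increment property: because $x(s_k)$ depends only on the past of $w$, the cross terms in the square vanish in expectation.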

2.4 Topologies for Random Processes : 

The appropriate topology for the convergence arguments of the 
next chapter is introduced in this section. The topology is the usual 
one generated by weak convergence on a set of measures and, following 
a brief discussion of the general case, its properties are discussed 
for certain random function spaces including the continuous functions. 
The preservation of weak convergence under mappings is the final 
subject of this section. 

Consider the following question: If the random process $x(t)$ is 
the limit in some precise sense of the sequence of processes $\{x_n(t)\}$, 
then for some functional $f$ is it possible to determine the distribution 
of $f(x)$ if those of $f(x_n)$ are known? In other words, is the distribution 
of $f(x)$ the limit in some sense of the distributions of $f(x_n)$? It 
is clear that some regularity assumptions must be placed on $f$ to 
make these questions meaningful; typical examples of functionals 
$f$ are 

\[ f(x) = \int_{t_1}^{t_2} g(x(t))\, dt , \qquad f(x) = \sup_t |x(t)| . \]

The techniques introduced below have been developed to answer 
questions such as these. 

Let $(X,d)$ be a complete, separable metric space and let 
$\mathcal{B}_d(X)$ denote the class of Borel subsets of $X$ generated by the 
$d$-topology. Let $C(X)$ denote the set of continuous (in $d$) functionals 
on $X$. Let $(\Omega, \mathcal{F}, P)$ be the basic probability space and let 
$x, x_n : \Omega \to X$ be random variables. The distribution of $x$ ($x_n$) 
is defined as 

\[ \mu_{(n)}(A) = P\{\omega \in \Omega : x_{(n)}(\omega) \in A \in \mathcal{B}_d(X)\} . \]

Then a necessary and sufficient condition for convergence of the 
sequence of distributions of $f(x_n)$ to that of $f(x)$ for all $f \in C(X)$ 
is that 

\[ \lim_{n \to \infty} \int_X h(x)\, \mu_n(dx) = \int_X h(x)\, \mu(dx) \]

for all bounded, continuous functionals $h$. This answers the question 
posed above subject to the restrictions imposed and makes further 
study of the limiting operation above of interest. 

Let $PM(X)$ denote the set of probability measures on $\mathcal{B}_d(X)$, 
and let $BC(X)$ denote the set of bounded, continuous functionals. 
If for elements $\mu_n$, $\mu$ of $PM(X)$ 

\[ \int_X h\, d\mu_n \to \int_X h\, d\mu \quad \text{for every } h \in BC(X) , \]

then $\mu_n$ converges weakly to $\mu$, or $\mu_n \xrightarrow{w} \mu$. This convergence is 
determining by the following: 

Theorem 1 : [7, p.9] Elements $\mu$, $\nu$ of $PM(X)$ coincide if 
$\int_X h\, d\mu = \int_X h\, d\nu$ for every $h \in BC(X)$. Other implications are given 
in [7, Theorem 2.1, p. 11]. 
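
A minimal example of weak convergence, added here as a sketch and not taken from the source: the point masses at $1/n$ converge weakly to the point mass at 0, since $\int h\, d\mu_n = h(1/n) \to h(0)$ for every bounded continuous $h$, while the indicator of the closed set $\{0\}$ (bounded but discontinuous) shows that continuity of $h$ cannot be dropped. Measures are represented as dictionaries `{point: mass}`, a convention assumed for this illustration.

```python
import math


def integrate(h, measure):
    """Integral of h against a discrete measure given as {point: mass}."""
    return sum(mass * h(point) for point, mass in measure.items())


def point_mass(x):
    """Unit mass concentrated at x."""
    return {x: 1.0}


def bounded_continuous(x):
    return math.atan(x)               # an element of BC(R)


def indicator_of_zero(x):
    return 1.0 if x == 0.0 else 0.0   # bounded, but not continuous at 0


limit = point_mass(0.0)
gaps = [
    abs(integrate(bounded_continuous, point_mass(1.0 / n))
        - integrate(bounded_continuous, limit))
    for n in (1, 10, 100, 1000)
]
indicator_gap = abs(
    integrate(indicator_of_zero, point_mass(1e-3))
    - integrate(indicator_of_zero, limit)
)
```

The list `gaps` shrinks toward zero, whereas `indicator_gap` stays at one no matter how close the point mass is moved to the origin.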

Let a subset A c PM(X) be called relatively compact if every 
sequence in A has a weakly convergent subsequence (whose limit need 
not be in A, though in PM(X)). This compactness definition will 
be used in Chapters 3 and 4 to prove the existence of invariant 
distributions for stochastic processes. The criteria for determining 
relative compactness in general metric spaces are due largely to 
Prohorov and are given below. A family of probability measures 
$A \subset PM(X)$ is called tight if for every $\epsilon > 0$ there exists a compact 
set $K(\epsilon) \subset X$ such that $\mu(K(\epsilon)) > 1 - \epsilon$ for every $\mu \in A$ [7, p.37]. 

Theorem 2 : [7, p. 37] Let $(X,d)$ be a complete, separable 
metric space, $A \subset PM(X)$, then $A$ is tight if and only if it is 
relatively compact. 

On $PM(X)$ define a neighborhood system via the following 
sets: for $\mu \in PM(X)$, $\epsilon > 0$, $k \in Z^+$ 

\[ N_{k,\epsilon}(\mu) = \Big\{ \nu \in PM(X) : \Big| \int_X h_i\, d\nu - \int_X h_i\, d\mu \Big| < \epsilon ,\ 
   h_i \in BC(X) ,\ i = 1,2,\ldots,k \Big\} . \]

Call the topology generated by these neighborhoods the topology $\mathcal{W}$ 
of weak convergence; clearly $\mu_n \xrightarrow{w} \mu$ if and only if $\mu_n \to \mu$ ($\mathcal{W}$). 

A natural question to pose is: When is $\mathcal{W}$ metrizable? 
For $\mu, \nu \in PM(X)$ let 

\[ \epsilon_1 = \inf\{\epsilon > 0 : \mu(A) \le \nu(N_\epsilon(A)) + \epsilon \ \text{for every closed } A \subset X\} \]

where $N_\epsilon(A) = \{x \in X : d(x,A) < \epsilon\}$. Let $\epsilon_2$ 
be defined by reversing the roles of $\mu$ and $\nu$. Define 

\[ L(\mu,\nu) = \max(\epsilon_1, \epsilon_2) . \]

Theorem 3 : [7, p. 238] The function $L$ is a metric on $PM(X)$ called 
the Prohorov metric. Moreover, the $L$-topology is equivalent to $\mathcal{W}$ 
if the set $X$ is separable. 

By defining the distance between two random variables to be 
the $L$-distance between their distributions a metric ($L$) may be 
defined on $F(\Omega;X)$, the set of $X$-valued random variables. It is routine 
to verify 
Proposition : If $\{x_n\}$, $x$ are elements of $F(\Omega;X)$ then 

\[ P\{\omega \in \Omega : d(x_n(\omega), x(\omega)) \to 0\} = 1 \]

implies $L(x_n, x) = L(\mu_{x_n}, \mu_x) \to 0$. 

Conversely, 

Theorem 4 : [56] If $\{x_n\}$ is an $L$-Cauchy sequence (possibly defined 
on different probability spaces), then a sequence $\{y_n\}$ and $y$ may 
be constructed on $(\Omega, \mathcal{F}, P)$ such that 

\[ L(x_n, y_n) = 0 \quad \text{and} \quad P\{\omega : d(y_n(\omega), y(\omega)) \to 0\} = 1 . \]
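
For measures supported on finitely many points the Prohorov distance $L$ can be computed by brute force, which may help in building intuition for the metric. The sketch below is an illustration only: it enumerates the subsets of the support (all of which are closed) and locates the infimum by bisection. The dictionary representation `{point: mass}` and the function names are assumptions of this sketch.

```python
from itertools import combinations


def _one_sided(mu, nu, dist, tol=1e-9):
    """inf{eps > 0 : mu(A) <= nu(N_eps(A)) + eps for every A in supp(mu)}."""
    support = [p for p, m in mu.items() if m > 0]
    subsets = [c for r in range(1, len(support) + 1)
               for c in combinations(support, r)]

    def holds(eps):
        for A in subsets:
            mass_A = sum(mu[p] for p in A)
            # nu-mass of the open eps-neighborhood of A
            mass_nbhd = sum(m for q, m in nu.items()
                            if min(dist(q, a) for a in A) < eps)
            if mass_A > mass_nbhd + eps + 1e-12:
                return False
        return True

    lo, hi = 0.0, 1.0   # eps = 1 always satisfies the inequality
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if holds(mid):
            hi = mid
        else:
            lo = mid
    return hi


def prohorov(mu, nu, dist=lambda a, b: abs(a - b)):
    """Prohorov distance between two finitely supported measures on R."""
    return max(_one_sided(mu, nu, dist), _one_sided(nu, mu, dist))


d_shift = prohorov({0.0: 1.0}, {0.3: 1.0})             # two point masses
d_split = prohorov({0.0: 1.0}, {0.0: 0.5, 10.0: 0.5})  # half the mass moved far away
d_same = prohorov({0.0: 0.5, 1.0: 0.5}, {0.0: 0.5, 1.0: 0.5})
```

Restricting the subsets to the support of the first measure is sufficient here, because adding points of zero mass to $A$ can only enlarge the neighborhood on the right-hand side.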

Call a subset $\mathcal{A} = \{x_\alpha , \alpha \in A\}$ of elements of $F(\Omega;X)$ indexed 
by $A$, totally $L$-bounded in $(F(\Omega;X),L)$ if every infinite sequence 
$\{x_n\}_{n=1}^{\infty}$ taken from $\mathcal{A}$ has an $L$-Cauchy subsequence. This property 
is equivalent to the induced distributions of $\{x_\alpha\}$ being relatively 
compact. Precisely: 

Theorem 5 : [53] For $\mathcal{A}$ to be totally $L$-bounded, it is necessary and 
sufficient that for every $\epsilon > 0$, there exists a compact subset 
$K(\epsilon) \subset X$ (independent of $\alpha \in A$) with 

\[ P\{\omega : x_\alpha(\omega) \in K(\epsilon)\} > 1 - \epsilon , \quad \alpha \in A . \]

Or equivalently, that the induced distributions $\{\mu_{x_\alpha}\}$ be tight. 

Assume now that the metric space $(X,d)$ is the space of $R$-valued 
continuous functions on $R^+$ (denoted by $C(R^+)$) with the metric 

\[ d(f,g) = \sum_{n=1}^{\infty} 2^{-n}\, \frac{\|f-g\|_n}{1 + \|f-g\|_n} \]

where $\|h\|_n = \sup_{0 \le t \le n} |h(t)|$. Then $(C,d)$ is a complete, separable 
metric space. In this case $F(\Omega;C)$ is a space of random functions. 
The basic compactness result for measures defined on $(C, \mathcal{B}_d(C))$ 
is given by the following: 

Theorem 6 : [7, p. 95] A subset $\mathcal{A} \subset F(\Omega;C)$ is totally $L$-bounded if 
the following conditions are satisfied for any sequence $\{x_n\} \subset \mathcal{A}$: 

(i) the sequence (of distributions induced by) $\{x_n(0)\}$ is 
tight; 

and (ii) there exist constants $\gamma > 0$ and $\alpha > 1$ and a non-decreasing, 
continuous function $f$ on $R^+$ such that 

\[ P\{\omega : |x_n(t) - x_n(s)| \ge \lambda\} \le \frac{1}{\lambda^{\gamma}}\, |f(t) - f(s)|^{\alpha} \]

for all $t, s \in R^+$, $n \in Z^+$, and $\lambda > 0$. 

Corollary 7 : The moment condition 

\[ E\{|x_n(t) - x_n(s)|^{\gamma}\} \le |f(t) - f(s)|^{\alpha} \]

implies condition (ii) via Chebyshev's inequality. 
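
The step from the moment condition to condition (ii) is short enough to write out. The following is a sketch, assuming condition (ii) carries the usual factor $\lambda^{-\gamma}$ on its right-hand side; the first line uses monotonicity of $u \mapsto u^{\gamma}$ on $R^+$ and the second is Chebyshev's (Markov's) inequality.

```latex
\begin{align*}
P\{\omega : |x_n(t) - x_n(s)| \ge \lambda\}
  &= P\{\omega : |x_n(t) - x_n(s)|^{\gamma} \ge \lambda^{\gamma}\} \\
  &\le \lambda^{-\gamma}\, E\{|x_n(t) - x_n(s)|^{\gamma}\} \\
  &\le \lambda^{-\gamma}\, |f(t) - f(s)|^{\alpha} .
\end{align*}
```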

Corollary 8 : [41, p. 10] A subset $\mathcal{A} \subset F(\Omega;C)$ is totally $L$-bounded 
if there exist $c > 0$, $c_n > 0$, $n = 1,2,\ldots$, $A_n \in \mathcal{B}_d(C)$ such that, 
for every $x \in \mathcal{A}$ 

(i) $E\{x^2(0)\} < c$ ; 

(ii) $E\{|x(t) - x(s)|^8 ;\ x \in A_n\} < c_n |t-s|^2$ , $0 \le s,t \le n$ ; 

(iii) $\sum_{n=1}^{\infty} (1 - P\{\omega : x(\omega) \in A_n\})$ is uniformly convergent on $\mathcal{A}$. 
These results will be used in Chapter 3 to investigate the 
behavior of the solutions to stochastic feedback equations. In 
that setting it is necessary to understand the transformation of 
distributions by operators on sets of stochastic processes. 

Let $(X, \mathcal{B}(X))$ and $(Y, \mathcal{B}(Y))$ be measurable spaces and 
$G : X \to Y$ a measurable function; for $f \in F(\Omega;X)$ let $\mu_f$ be the 
distribution induced on $\mathcal{B}(X)$ by $f$. Recall that 

\[ \mu_f(A) = P\{\omega : f(\omega) \in A \in \mathcal{B}(X)\} = P(f^{-1}A) . \]

Then assuming $G : F(\Omega;X) \to F(\Omega;Y)$, for $f \in F(\Omega;X)$, $Gf$ induces in 
the same way a distribution on $\mathcal{B}(Y)$ according to 

\[ \mu_{Gf}(B) = P\{\omega : G[f(\omega)] \in B \in \mathcal{B}(Y)\} 
   = P\{\omega : f(\omega) \in G^{-1}B \in \mathcal{B}(X)\} 
   = \mu_f(G^{-1}B) . \]
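
The identity $\mu_{Gf}(B) = \mu_f(G^{-1}B)$ is easy to confirm on a finite example. The sketch below is illustrative only: the uniform measure on five integers and the map $G(x) = x^2$ are hypothetical choices, and exact rational arithmetic is used so that the two sides can be compared without rounding.

```python
from collections import defaultdict
from fractions import Fraction


def pushforward(mu, G):
    """Distribution induced by G: mass of y is the mu-mass of G^{-1}({y})."""
    out = defaultdict(Fraction)
    for x, mass in mu.items():
        out[G(x)] += mass
    return dict(out)


def measure_of(mu, event):
    """mu-mass of the set {x : event(x) is True}."""
    return sum((m for x, m in mu.items() if event(x)), Fraction(0))


def G(x):
    return x * x


def in_B(y):
    return y >= 1


mu_f = {x: Fraction(1, 5) for x in (-2, -1, 0, 1, 2)}
mu_Gf = pushforward(mu_f, G)

lhs = measure_of(mu_Gf, in_B)                   # mu_{Gf}(B)
rhs = measure_of(mu_f, lambda x: in_B(G(x)))    # mu_f(G^{-1} B)
```

Both sides evaluate the mass of the same set of sample points, once after and once before the map is applied.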

If $G$ is a random function the transformation is more interesting. 
Let $(X,d_x)$ and $(Y,d_y)$ be separable metric spaces. Then $\mathcal{D}(X,Y)$ is the 
set of operators $g : X \to Y$, continuous in the strong topology. Let 
$\mathcal{D}(X,Y)$ have the strong operator topology [16, p. 475], and let 
$\mathcal{B}(\mathcal{D})$ be the least Borel $\sigma$-algebra induced by this topology. 
As in section 2.3, $F(\Omega;\mathcal{D})$ denotes the set of $\mathcal{D}(X,Y)$-valued random 
variables. 

A criterion sufficient to guarantee the assumption $G : F(\Omega;X) \to F(\Omega;Y)$ 
is the following 
Theorem 9 : [32] Let $x$ be an element of $F(\Omega;X)$ and let $G \in F(\Omega;\mathcal{D})$, 
then the function $y : \Omega \to Y$ given by 

\[ y(\omega) = (Gx)(\omega) = G(\omega)[x(\omega)] \]

is a random variable if $G(\omega)[\cdot]$ is continuous $X \to Y$ for almost 
every $\omega \in \Omega$. Thus, every $G \in F(\Omega;\mathcal{D})$ maps $F(\Omega;X)$ into $F(\Omega;Y)$. 

For the random variable $y = Gx$ a distribution is induced on 
$\mathcal{B}(Y)$ according to 

\[ \mu_{Gx}(B) = P\{\omega : (Gx)(\omega) \in B \in \mathcal{B}(Y)\} 
   = P\{\omega : G(\omega)[x(\omega)] \in B\} . \]

Now by assumption $(X,d_x)$ is separable; it follows that $X$ has a 
countable base [7, Appendix I], that is, a family of open sets such 
that every other open subset of $X$ is the union of a sub-family of 
these. Indicate this base by $\mathcal{A} = \{A_i\}_{i=1}^{\infty}$ and assume (without 
loss of generality) that the $A_i$ are pairwise disjoint. It follows 
from the Borel property of $\mathcal{B}(X)$ (it is generated by the topology) 
that $\mathcal{B}(X)$ is generated by $\mathcal{A}$. Returning to the expression for 
$\mu_{Gx}$ for $G \in F(\Omega;\mathcal{D})$, it follows from the last few remarks that 

\[ \mu_{Gx}(B) = P\Big\{ \bigcup_{A_i \in \mathcal{A}} \big[ \{\omega : x(\omega) \in A_i\} \cap \{\omega : G(\omega) \in \mathcal{D}(A_i,B)\} \big] \Big\} . \]

Here $\mathcal{D}(A_i,B) \subset \mathcal{D}(X,Y)$ is the set of operators $g$ mapping $X$ into 
$Y$ and $A_i$ into $B$. (The random variables $x$ and $G$ have been assumed 
independent under $P$.) The formal expression 

\[ \mu_{Gx}(B) = \int_X \mu_x(dx)\, \mu_G(\mathcal{D}(dx,B)) \]

follows from above. 

Now assume that $X$ and $Y$ are $R$-valued function spaces on $R^+$, closed 
under the truncation operators $\{\pi_t\}_{t \in R^+}$. Let $\mathcal{D}(X,Y)$ be further 
restricted to include operators causal as well as continuous. Each 
element $x$ of $F(\Omega;X)$ generates a set of distributions $\{\mu_{\pi_t x}\}_{t \in R^+}$ on 
$\mathcal{B}(X)$ according to the rule 

\[ \mu_{\pi_t x}(A) = P\{\omega : \pi_t x(\omega) \in A\} . \]

And in the same manner as above for $G \in \mathcal{D}(X,Y)$ and $B \in \mathcal{B}(Y)$ 

\[ \mu_{\pi_t G \pi_t x}(B) = \mu_{\pi_t x}((\pi_t G)^{-1} B) . \]
Assuming that $(X,d_x)$ and $(Y,d_y)$ as sets of functions are 
separable metric spaces, and that the random operator $G$ is an 
element of $F(\Omega;\mathcal{D})$, then the formal expression below gives 


t /At t 

the distributions induced on $\mathcal{B}(Y)$ by $Gx$ for any element $x$ of 
$F(\Omega;X)$. 

As the final topic of this section consider the questions raised 
by the transformation of a weakly convergent sequence of distributions 
by an operator. Precisely, let $A \subset PM(X)$ be a relatively compact 
set of measures and let $H$ be a function mapping $X$ into itself: under 
what restrictions does $H$ preserve weak convergence in $PM(X)$ and 
relative compactness of $A$? A partial answer is given in 

Theorem 10 : (Topsøe [61]) Let $(X,d,\mathcal{B}(X))$ be a complete, separable 
metric space, $H$ a measurable map from $X$ into itself, and $\{\mu_n\}_{n=1}^{\infty}$ a 
weakly convergent sequence of elements of $PM(X)$ with limit $\mu$. Then 
the sequence $\{\mu_n(H^{-1}\cdot)\}_{n=1}^{\infty}$ is weakly convergent (to $\mu(H^{-1}\cdot)$) if 
$H$ is continuous (modulo $\mu$). 

Though easily proved by examining the terms 

\[ \int_X f(H(x))\, \mu_n(dx) \]

for $f \in BC(X)$ (that is, $f(H\cdot) \in BC(X)$ if $H$ is continuous), generalization 
of this result to the case where $H$ is random is not straightforward. 

For $\mu \in PM(X)$ and $G \in F(\Omega;\mathcal{D}(X,X))$ define 

\[ \nu(\omega)[\cdot] = \mu(G(\omega)^{-1}(\cdot)) . \]

In general let $L$ denote the Prohorov metric on $PM(X)$ and let 
$\mathcal{B}(PM)$ be the least Borel $\sigma$-algebra generated by the $L$-topology. For 
any basic probability space $(\Omega, \mathcal{F}, P)$ then $F(\Omega;PM)$ has the usual 
interpretation and is well-defined as a consequence of the metric 
properties of $L$. Each element $\nu$ of $F(\Omega;PM)$ is pointwise a probability 
measure, $\nu(\omega) \in PM(X)$ for each $\omega \in \Omega$. 

Two definitions of convergence of $F(\Omega;PM)$ are given in 

Definition : (i) The sequence $\{\nu_n(\omega)\} \subset F(\Omega;PM)$ converges weakly 
almost surely to $\nu \in F(\Omega;PM)$ if for every $f \in BC(X)$ 

\[ \lim_{n \to \infty} \operatorname*{ess\,sup}_{\omega \in \Omega} \Big| \int_X f(x)\, \nu_n(\omega)[dx] - \int_X f(x)\, \nu(\omega)[dx] \Big| = 0 . \]

Denote this by $\nu_n \to \nu$ $(w, L_\infty)$. 

(ii) The sequence $\{\nu_n(\omega)\} \subset F(\Omega;PM)$ converges weakly in mean to 
$\nu \in F(\Omega;PM)$ if for every $f \in BC(X)$ 

\[ \lim_{n \to \infty} E\Big\{ \Big| \int_X f(x)\, \nu_n(\omega)[dx] - \int_X f(x)\, \nu(\omega)[dx] \Big| \Big\} = 0 . \]

Denote this by $\nu_n \to \nu$ $(w, L_1)$. 

The next theorem gives conditions on the operator $G \in F(\Omega;\mathcal{D})$ so that 
convergence in the senses (i) and (ii) above is implied by $\mu_n \to \mu$ in the 
Prohorov topology. 

Theorem 11 : Let $(X, \|\cdot\|)$ be a separable Banach space, and let $G \in F(\Omega;\mathcal{D}(X))$. 

(i) Then $\mu_n \to \mu$ (in $L$) implies that 

\[ \nu_n(\omega)[\cdot] = \mu_n(G(\omega)^{-1}\cdot) \to \nu \ \ (w, L_\infty) \quad \text{for some } \nu \in F(\Omega;PM) . \]

(ii) Let $G \in F(\Omega;\mathcal{D})$ be such that 

\[ E\{\|G(\omega)[x]\|\} \le K \|x\| \]

for all $x \in X$ and some finite $K$ independent of $x$. Then $\mu_n \to \mu$ 
(in $L$) implies that $\nu_n \to \nu$ $(w, L_1)$ for some $\nu \in F(\Omega;PM)$. 

Proof : (i) Since $G(\omega) \in \mathcal{D}(X)$ (modulo $P$), $f(G(\omega)[\cdot])$ is an element of 
$BC(X)$ for almost every $\omega \in \Omega$ and every $f \in BC(X)$. Hence, for almost 
every $\omega \in \Omega$ 

\[ \lim_{n \to \infty} \Big| \int_X f(G(\omega)[x])\, \mu_n(dx) - \int_X f(G(\omega)[x])\, \mu(dx) \Big| = 0 \]

which implies the conclusion for $\nu(\omega) = \mu(G(\omega)^{-1}\cdot)$. 

(ii) By the hypothesis of (ii) the integral 

\[ \int_\Omega G(\omega)[x]\, P(d\omega) \]

is uniformly bounded (in $\|\cdot\|$) in $x$. Thus, since the $\mu_n$ are probability 
measures (specifically, they are $\sigma$-finite), Fubini's Theorem implies 
the equality (for every $f \in BC(X)$) 

\[ \int_\Omega \int_X f(G(\omega)[x])\, \mu_n(dx)\, P(d\omega) = \int_X \int_\Omega f(G(\omega)[x])\, P(d\omega)\, \mu_n(dx) . \]

Since $f \in BC(X)$ and $G(\omega) \in \mathcal{D}(X)$ (almost surely), and by the assumption 
of (ii), the function 

\[ \int_\Omega f(G(\omega)[\cdot])\, P(d\omega) : X \to R \]

is an element of $BC(X)$. The conclusion follows using the reasoning 
in the proof of (i). 

Remarks : (1) Thus, continuity of $G(\omega)$ on $X$, almost surely ($P$), is 
sufficient to guarantee $(w,L_\infty)$-convergence for $G$ operating on $L$-convergent 
distributions $\mu_n$. Clearly convergence $(w,L_\infty)$ implies convergence 
$(w,L_1)$. 

(2) It is useful to think of the elements of $F(\Omega;PM)$ as "random 
distributions." That is, assume that a number of control policies 
are available and that each of these is stochastic because of the nature 
of the task at hand. Then each of these possible policies may be 
represented by an element of $PM(X)$, and if the control decision is made 
at random it may be modeled as an element of $F(\Omega;PM)$. In other words 
a control policy is chosen according to some probability law from a 
set of stochastic controls. See the paper [76] for some related 
definitions of relaxed stochastic controls. 

In the setting here the uncertain system "randomizes" the set 
of probability distributions representing the input and it is this 
point of view that is used in the latter portions of section 3.1. 
CHAPTER 3 

ASYMPTOTIC PROPERTIES OF STOCHASTIC SYSTEMS 


3.1 Asymptotic Properties of General Feedback Systems : 

The results in this chapter summarize an analysis of the 
asymptotic properties of feedback systems described by possibly 
random operators and subjected to stochastic inputs. In this 
section the properties of general feedback systems are investigated, 
and a theorem akin to the Small-Gain Theorem (section 2.2) is used 
to establish moment bounds for signals in feedback systems. Under 
certain conditions on the system operator and the input the distributions 
of the feedback signals are shown to be asymptotically invariant. 

These results are reviewed in sections 3.2 and 3.3 for certain feed- 
back systems described by random convolution operators. In section 
3.4 a summary of the related existing theory for systems described 
by differential equations is presented. 

Before undertaking the analysis of the asymptotic properties of 
uncertain systems it is important to define precisely the nature of 
such a system in feedback form. First the notion of a proper 
signal space is required. Let $\theta \subset R$ be a linearly ordered set, the 
time set. Let $X$ be a set of $R$-valued functions on $\theta$, assumed to be 
Borel measurable (i.e. $x^{-1}(\mathcal{B}(R)) \subset \mathcal{B}(\theta)$ for every $x \in X$). Let 
$\{\pi_t\}_{t \in \theta}$ denote the set of causal truncations introduced earlier, 
and denote by $\{\xi_t\}_{t \in \theta}$ the set of anti-causal truncations. Assume 
that $X$ is closed under both species of truncation. In that case 
\[ (\xi_t x)(s) = \begin{cases} x(s) , & s > t \\ 0 , & s \le t \end{cases} \]

or symbolically $\xi_t = I - \pi_t$ ($I$ the identity on $X$). 

Giving $\theta$ an appropriate topology (relative to $R$), $X$ may be 
topologized and a (least) Borel $\sigma$-algebra $\mathcal{B}(X)$ induced by the 
topology of $X$. To emphasize the fact that the systems to be studied 
here are to be considered as control systems, the set of signals 
is constrained to begin at some finite time. Thus, the set of 
signals admitted in the system is constrained as 

\[ S(\Omega;X) = \{f \in F(\Omega;X) : \pi_t f = 0 \ \text{for some } t \in \theta\} . \]

Let $\mathcal{D}(X)$ again denote the set of operators mapping $X$ into 
itself. Indicate by $\mathcal{D}_c(X)$ the subset of $\mathcal{D}(X)$ consisting of causal, 
continuous operators. All systems to be studied here will by assumption 
be constructed from elements of $\mathcal{D}_c(X)$. Note, however, that this 
does not imply that the overall system will be either causal or 
continuous, and in general additional conditions will be required to 
assure preservation of these properties. See Willems [64] for a 
discussion of this feature of feedback systems which he calls 
well-posedness. Every element of $\mathcal{D}_c(X)$ induces a natural map on $F(\Omega;X)$ 
into itself using the continuity assumption and a natural map on 
$S(\Omega;X)$ by the additional restriction of causality. That is, for 
$x \in F(\Omega;X)$, $G \in \mathcal{D}_c(X)$ then 

\[ y(\omega) = G[x(\omega)] \]

and $y \in F(\Omega;X)$ by continuity of $G$. Furthermore, for $x \in S(\Omega;X)$, 
$G \in \mathcal{D}_c(X)$ and assuming $G0 = 0$, then $y \in S(\Omega;X)$ because $G$ is causal 
(and $G0 = 0$). 

A feedback system will be specified by a set of inputs, a 
plant, and a feedback controller. It will be assumed here that the 
system signals have their values in the same space as the inputs. 

The input space is defined as follows: Let $X$, $\theta$, and 
$(\Omega_1, \mathcal{F}_1)$ be specified as above and let $\{P_\alpha\}_{\alpha \in A}$ ($A$ an index set) 
be a set of probability measures on $(\Omega_1, \mathcal{F}_1)$. For each $\alpha \in A$, 
$f \in S(\Omega_1;X)$ induces a probability distribution on $(X, \mathcal{B}(X))$ according 
to the rule 

\[ \mu_{\alpha,\pi_t f}(B) = P_\alpha\{\omega \in \Omega_1 : \pi_t f(\omega) \in B \in \mathcal{B}(X)\} . \]

The input space is defined as an element of $\{S(\Omega_1;X), \mu_\alpha\}_{\alpha \in A}$ 
for some choice $\alpha \in A$. The flexibility allowed by specifying a set 
of distributions $\{P_\alpha\}_A$ rather than a single distribution reflects 
the empirical nature of the analysis of physical systems containing 
uncertainties. Frequently a number of hypothetical distributions 
for any uncertainty are proposed and some method of hypothesis testing 
used to determine the "best" of the candidates. This selection process 
should be regarded as preliminary to the analysis contained here. 

The plant is defined by the following procedure: Let $\mathcal{D}_c(X)$ 
be specified as above, and let $(\Omega_2, \mathcal{F}_2)$ be a measurable space (possibly 
distinct from $(\Omega_1, \mathcal{F}_1)$). Let $\{P_\beta\}_{\beta \in B}$ be a set of distributions on 
$\mathcal{F}_2$, and $F(\Omega_2;\mathcal{D}_c)$ the set of $\mathcal{D}_c$-valued random operators on $\Omega_2$ 
governed by the law induced on $\mathcal{B}(\mathcal{D}_c)$ according to 
\[ \mu_{\beta,m}(D) = P_\beta\{\omega \in \Omega_2 : m(\omega) \in D \in \mathcal{B}(\mathcal{D}_c)\} \]

by any element $m$ of $F(\Omega_2;\mathcal{D}_c)$. The collection 

\[ \{F(\Omega_2;\mathcal{D}_c), \mu_\beta\}_{\beta \in B} \]

is the set of plants "selected" according to the law $\mu_\beta$ chosen as 
optimally accounting for the physical observations. 

For the purposes of this analysis the feedback operator is also 
assumed to be uncertain, though in design problems it usually may 
be freely chosen. Under this assumption the set of feedback controllers 
is specified in exactly the same manner as was the set of plants. For 
a given measurable space $(\Omega_3, \mathcal{F}_3)$ and a set of hypothetical distributions 
$\{P_\gamma\}_{\gamma \in \Gamma}$ on $\mathcal{F}_3$, a feedback controller is an element of $F(\Omega_3;\mathcal{D}_c)$ 
governed by the law specified as best. 

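For any element $x$ of $F(\Omega;X)$ let $\mathcal{B}(\pi_t x)$ denote the least 
Borel algebra generated by $\pi_s x$, $s \le t$; in symbols 

\[ \mathcal{B}(\pi_t x) = \mathcal{B}(\pi_s x)_{s \le t ;\ s,t \in \theta} . \]

The assumption of measurability of $x$ assures that $\mathcal{B}(\pi_t x) \subset \mathcal{F}$ 
for every $t \in \theta$. 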

Definition 1 : Given a measurable space $(\Omega, \mathcal{F})$ and a set of probability 
measures $\{P_\alpha\}_{\alpha \in A}$ on $\mathcal{F}$, a functional $h$ on $F(\Omega;X)$ into itself is 
said to be $\alpha$-non-anticipative if for every $x \in F(\Omega;X)$, $\mathcal{B}(\pi_t[h(x)])$ 
is independent of $\mathcal{B}(\xi_t x)$ for every $t \in \theta$ with respect to $P_\alpha$. 
See, for instance, Gikhman and Skorokhod [29, section 3.3] for 
a discussion of independence of set algebras. Call $h$ non-anticipative 
if $h$ is $\alpha$-non-anticipative for every $\alpha \in A$. 

Informally stated, the definition says that values of the function 
$h(x)(t)$ are independent of the future $\xi_t x$ of $x$, at least with respect 
to the distribution $P_\alpha$. 

Proposition 2 : If $h$ is causal, $\pi_t h(x) = \pi_t h(\pi_t x)$, then $h$ is 
non-anticipative. 
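
The truncation operators and the causality condition can be made concrete on sampled signals. The sketch below is an illustration under assumptions: signals are finite lists indexed by sample time, causality is taken in the form $\pi_t h(x) = \pi_t h(\pi_t x)$, and the function names are hypothetical.

```python
def pi_t(x, t):
    """Causal truncation: keep samples with index <= t, zero the rest."""
    return [v if k <= t else 0.0 for k, v in enumerate(x)]


def xi_t(x, t):
    """Anti-causal truncation, xi_t = I - pi_t."""
    return [v - p for v, p in zip(x, pi_t(x, t))]


def causal_convolution(g):
    """Return the operator x -> y with y[k] = sum_{s <= k} g[k - s] x[s]."""
    def op(x):
        return [sum(g[k - s] * x[s] for s in range(k + 1) if k - s < len(g))
                for k in range(len(x))]
    return op


def look_ahead(x):
    """Shift the signal one step forward in time: y[k] = x[k + 1]."""
    return x[1:] + [0.0]


def is_causal(op, x, tol=1e-12):
    """Check pi_t op(x) == pi_t op(pi_t x) for every sample time t."""
    return all(
        max(abs(a - b)
            for a, b in zip(pi_t(op(x), t), pi_t(op(pi_t(x, t)), t))) <= tol
        for t in range(len(x))
    )


x = [1.0, -2.0, 0.5, 3.0, -1.0]
G = causal_convolution([1.0, 0.5, 0.25])
convolution_is_causal = is_causal(G, x)
look_ahead_is_causal = is_causal(look_ahead, x)
```

The convolution passes the check because its output at time $k$ uses only inputs up to $k$; the look-ahead operator fails it, since its output at time $t$ depends on the sample at $t+1$ that the truncation removes.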

Definition 3: A stochastic dynamical system is a 4-typle 

(S^JX), ^ p a > aeA ; )» where S^jX) is the set 

of X-valued signals, {P } a set of distributions on (ft. , & , 

V-A 1 1 

{Pg} a set on (« 2 , ^ 2 > and F a set of J) -valued maps on 
fl 2 - Here each element G of is non-anticipative (with respect 
to <P a » on S(fl^;X) into Itself. Moreover, for each G e 
assume GO » 0. 

Definition 4: A stochastic dynamical system is said to be in feedback 

It may be written as the 6-tuple 

{s(!i r x) ’ {P a}o£A , {P 6 ) 6eB ; F<n 3 ,i), {p y } $er } 

where the components have the meanings and implications established 
above. Moreover, that the operator H selected on x fi ^ x & ) 

• J fc J 

according to {P g } x {p^} given by 
H(u 2 ,u) 3 ) ■ (I+K(w 3 ) - G(w 2 )) 

(G e F(ft 2 ;j^ ), K e F(& 3 ;jj?J )) is one-to-one and non-anticipative 
with respect to $\{P_\alpha\}$ on $S(\Omega_1;X)$ into itself. In addition $H0 = 0$.

Clear from this definition is the observation that by identifying
$(\Omega_2 \times \Omega_3, \mathcal{B}_2 \times \mathcal{B}_3, \{P_\beta\} \times \{P_\gamma\})$ with some space $(\Omega, \mathcal{B}, \{P_\beta\})$ the
random operator H may be specified on $\Omega$ by

H(u) «* I + G<w) 

»v _ 

where G is a -valued random variable on ft. Moreover, by combining 
$(\Omega_1, \mathcal{B}_1, \{P_\alpha\})$ and $(\Omega, \mathcal{B}, \{P_\beta\})$ in the same way, it is possible to
define H and the input signals $S(\Omega_1;X)$ on the same probability space,
governed by the same collection of probability laws. The conclusion
of this argument is that, for the purposes of this analysis at
least, it suffices to consider the random operator equation

$x(\omega) + G(\omega)[x] = u(\omega)$

defined on some probability space $(\Omega, \mathcal{B}, \{P_\alpha\})$ as representative
of the feedback system under investigation. Here $u, x \in S(\Omega;X)$,
u an admissible input, x to be studied, and G is a random operator
on $F(\Omega;X)$ into itself, non-anticipative with respect to $\{P_\alpha\}_{\alpha \in A}$. Moreover,
for the purposes of the analysis to follow it is a useful
simplification to assume that G(w) is an element of j$ c 00 (cj§ ), 
the causal, continuous operators on X. Thus, using Proposition 2
above, the qualifier "non-anticipative with respect to $\{P_\alpha\}$" may be
ignored for such operators G. Finally, the assumption is made that
by some decision process the "best" distribution has been chosen from
among $\{P_\alpha\}_A \times \{P_\beta\}_B \times \{P_\gamma\}_\Gamma$ on the product space $(\Omega_1 \times \Omega_2 \times \Omega_3, \mathcal{B}_1 \times \mathcal{B}_2 \times \mathcal{B}_3)$.
Designate this underlying basic space by the customary symbols $(\Omega, \mathcal{B}, P)$.
and $\mathcal{L}_{qe}(\Lambda)$ (from here on the arguments $\Omega$ and X will be omitted

when not of central concern). Let G be an element of F(ft; ) 

and u any element of S(ft;X), then make the following: 

Assumption (A1): (Existence of a locally bounded solution) For the
equation

(1) $u(t,\omega) = x(t,\omega) + (G(\omega)[x(\omega)])(t)$, $t \in \Theta$, $\omega \in \Omega$

assume that $u \in \mathcal{L}_{qe}(\Lambda)$ implies that $x \in \mathcal{L}_{qe}(\Lambda)$.
That is, that $\pi_t u \in \mathcal{L}_q(\Lambda)$ for any $t \in \Theta$ implies $\pi_t x \in \mathcal{L}_q(\Lambda)$.

As remarked, the assumptions of causality and continuity of G
on the function space X (and $G0 = 0$) establish that $x \in S(\Omega;X)$.
What is assumed here is roughly (dependent on $\Lambda$) the additional
property that x has a "locally" bounded qth absolute moment.

The following result is the analog of Theorem 2.2.3 (Small
Gain Theorem) in this setting.

Theorem 5: For the equation (1) above subject to the assumptions 

introduced with G e F(ft; J) c ) and u e £ q (A) fi S(ft;X), a sufficient 
condition that $x \in \mathcal{L}_q(\Lambda) \cap S(\Omega;X)$ is that

$\|G\|_{q,\Lambda} \le a(\Lambda) < 1$

for some $a(\Lambda)$ independent of u.

Proof: By the assumption (A1) x exists and by virtue of the causality
of $(I+G)^{-1}$ on X, x is an element of $\mathcal{L}_{qe}(\Lambda) \cap S(\Omega;X)$. Moreover,
using the causality of G

$\pi_t x(\omega) = \pi_t u(\omega) - \pi_t G(\omega)[\pi_t x(\omega)]$, $t \in \Theta$, $\omega \in \Omega$

so x does not anticipate u and is a well-defined solution. Next,
using the triangle inequality property of $\|\cdot\|_{q,\Lambda}$ as a norm,
it follows that

$\|\pi_t x\|_{q,\Lambda} \le \|\pi_t u\|_{q,\Lambda} + \|\pi_t G[\pi_t x]\|_{q,\Lambda}.$

The assumption on G permits

$\|\pi_t x\|_{q,\Lambda} \le \|\pi_t u\|_{q,\Lambda} + a(\Lambda)\, \|\pi_t x\|_{q,\Lambda}.$

The restriction on $a(\Lambda)$ and the assumption $u \in \mathcal{L}_q(\Lambda)$ lead to

$\|\pi_t x\|_{q,\Lambda} \le [1-a(\Lambda)]^{-1}\, \|u\|_{q,\Lambda}.$

Observing the right hand side to be independent of $t \in \Theta$ it follows
that

$\|x\|_{q,\Lambda} = \sup_{t \in \Theta} \|\pi_t x\|_{q,\Lambda} \le [1-a(\Lambda)]^{-1}\, \|u\|_{q,\Lambda}$

and hence, that the conclusion of the theorem is valid.

QED
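The contraction step in the proof can be checked numerically on a single sample path. The sketch below is only a deterministic, discrete-time analog and is not taken from the text: the kernel `g`, the nonlinearity `tanh`, and the sup norm are illustrative assumptions standing in for G and for the moment norm of the theorem.

```python
import math

def solve_feedback(u, g, phi):
    """Solve x[k] + sum_{j<k} g[k-j] * phi(x[j]) = u[k] forward in time."""
    x = []
    for k in range(len(u)):
        # Strictly causal feedback term: only past values x[0..k-1] enter.
        feedback = sum(g[k - j] * phi(x[j]) for j in range(k) if k - j < len(g))
        x.append(u[k] - feedback)
    return x

def gain(g):
    """Sup-norm gain bound of the feedback operator when |phi(z)| <= |z|."""
    return sum(abs(c) for c in g[1:])

# Hypothetical data: a summable kernel with gain just under 0.4, sinusoidal input.
g = [0.0] + [0.4 * 0.5 ** i for i in range(1, 30)]
u = [math.sin(0.3 * k) for k in range(200)]
x = solve_feedback(u, g, math.tanh)

a = gain(g)
bound = max(abs(v) for v in u) / (1.0 - a)   # (1 - a)^(-1) * ||u||
print(a < 1.0, max(abs(v) for v in x) <= bound)
```

Here the gain `a` plays the role of $a(\Lambda)$, and the comparison on the last line is the finite-gain inequality obtained at the end of the proof.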


Note that the inequality $\|x\|_{q,\Lambda} \le K \|u\|_{q,\Lambda}$ for some $K < \infty$
is a "bonus" not required in the theorem. In deterministic stability
theory this property ($\|x\| \le K \|u\|$) is sometimes called "finite-gain
stability" and is frequently included as a condition in the
definition of stability to preclude certain uniform boundedness
arguments. See Willems [65] for a discussion of this point. Though
not explicitly required above, the finite gain property will be
decisive below, where certain assumptions on u are used to deduce
properties of x other than boundedness (see the proof of Theorem 3.2.5).

By the assumptions preceding the theorem I+G has a causal inverse
on X or more generally on $S(\Omega;X)$, and that inverse is locally bounded
(maps $\mathcal{L}_{qe} \to \mathcal{L}_{qe}$); the theorem guarantees that the inverse is
globally bounded ($\mathcal{L}_q \to \mathcal{L}_q$). An important corollary to the theorem
proceeds from the definition


$\tilde{a}(\Lambda) = \sup_{x_1 \ne x_2} \dfrac{\|G x_1 - G x_2\|_{q,\Lambda}}{\|x_1 - x_2\|_{q,\Lambda}}$


of the incremental gain of G e 
Corollary 6: For the equation

$u_1(\omega) - u_2(\omega) = x_1(\omega) - x_2(\omega) + G(\omega)[x_1(\omega)] - G(\omega)[x_2(\omega)]$

with u^, u 2 e £ qe (£) f) S(fl;X) and G e F(fl; jS c ) subject to the 
additional constraint 


$u_1 - u_2 \in \mathcal{L}_q(\Lambda) \cap S(\Omega;X)$,

a sufficient condition that $x_1 - x_2 \in \mathcal{L}_q(\Lambda) \cap S(\Omega;X)$ is that $\tilde{a}(\Lambda) < 1$.

Proof: By assumption (A1) above $x_1 - x_2 \in \mathcal{L}_{qe}(\Lambda)$; by the causality
of the inverse of I+G on X, $x_i \in \mathcal{L}_{qe}(\Lambda) \cap S(\Omega;X)$. Moreover, causality
of G assures that $x_1 - x_2$ does not anticipate $u_1 - u_2$ and so that $x_1 - x_2$ is
well-defined as a solution of the equation. The remainder of the theorem
follows directly from the definition of $\tilde{a}(\Lambda)$ and the equation

$\pi_t x_1(\omega) - \pi_t x_2(\omega) = \pi_t u_1(\omega) - \pi_t u_2(\omega) - \pi_t G(\omega)[\pi_t x_1(\omega)] + \pi_t G(\omega)[\pi_t x_2(\omega)]$

along the line of the proof of Theorem 5.

QED

Remark 7: That Corollary 6 is a more stringent requirement for a
system than Theorem 5 follows immediately from the observation

a(£) < a(£) for every G e F(fl; £ ) (choose x = 0 in the definition 

C 2 

of $\tilde{a}(\Lambda)$). Thus, Theorem 5 may hold and Corollary 6 not. When valid,
Corollary 6 guarantees that not only does I+G have a causal, bounded
inverse on $\mathcal{L}_q(\Lambda)$, but also that the inverse is continuous. This
property is essential in the sequel.

Let $\Theta$ be the fixed set $R_+ = [0,\infty)$ (another choice is
$\Theta = [t_0,\infty)$ for some $t_0 \in R$). Let (X,d) be a complete, separable
metric space of functions mapping $R_+$ into R. Then with this choice
of $\Theta$ it is possible to identify $F(\Omega;X)$ and $S(\Omega;X)$ (that is, all
elements of $F(\Omega;X)$ are for each $\omega$ elements of $S(\Omega;X)$; the opposite
inclusion holds by definition). Moreover, for the two functionals
mentioned earlier

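$\Lambda_1(f) = \int_0^\infty |f(t)|\, dt$, $f \in X$

$\Lambda_2(f) = \operatorname{ess\,sup}_{t \in R_+} |f(t)|$

the spaces $\mathcal{L}_q(\Omega;X;\Lambda_1)$ and $\mathcal{L}_q(\Omega;X;\Lambda_2)$ are Banach spaces.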

In the next two sections below specific choices of the space
X (as the set of continuous functions, and as the set of piecewise
continuous functions) permit the use of bounds on the space $\mathcal{L}_q(\Lambda_2)$
to make Prohorov's Theorem applicable to certain feedback systems.
Theorem 8 here is intermediate in this process.

Recall from section 2.4 the definitions of the Prohorov topology
and the definition of totally L-bounded sets of random variables.

Assume that (X, | | • | | ) is a Banach space. For He (X) define 

c 

the norm of H on X (distinct from the norm of H as an operator on
$\mathcal{L}_q$) as

$\rho(H) = \sup_{0 \ne x \in X} \dfrac{\|Hx\|}{\|x\|}$

and let $X_e = \{f : R_+ \to R : \pi_t f \in X\}$ be the extended space associated
with X.
Theorem 8: (Deterministic plant) Consider the equation on $S(\Omega;X)$

$u(\omega) = x(\omega) + G[x(\omega)]$

where 

u e S(ft;X), G e J) c (X) 

and the existence of a solution $x \in X_e$ such that $\pi_t x \in S(\Omega;X)$
is assumed. Moreover, assume that the set of distributions
$\{\mu_{\pi_t u}\}_{t \in R_+}$ induced by $\pi_t u$ on $\mathcal{B}(X)$ is relatively compact; then a
sufficient condition that the set $\{\mu_{\pi_t x}\}_{t \in R_+}$ be relatively compact
is that

(i) $\rho(G) < 1$
(ii) $(I+G)^{-1}$ is a bounded, continuous, causal operator on X.
Proof: Condition (i) assures that the solution $x(\omega)$ is an element
of X for every $\omega \in \Omega$. The argument is familiar:

$\pi_t x(\omega) = \pi_t u(\omega) - \pi_t G[\pi_t x(\omega)]$

$\|\pi_t x(\omega)\| \le \|\pi_t u(\omega)\| + \|\pi_t G[\pi_t x(\omega)]\| \le \|u(\omega)\| + \rho(G)\, \|\pi_t x(\omega)\|$

Thus,

$\|\pi_t x(\omega)\| \le [1-\rho(G)]^{-1}\, \|u(\omega)\|$ for every $t \in R_+$, $\omega \in \Omega$

and the conclusion is immediate. That $x \in S(\Omega;X)$ is a consequence
of the facts that $\pi_t x \in S(\Omega;X)$ for every t, and $x = \lim_{t \to \infty} \pi_t x$.

Again using the causality of G, for every $t \in R_+$, $\omega \in \Omega$

$\pi_t x(\omega) + \pi_t G[\pi_t x(\omega)] = \pi_t u(\omega)$.

Hence, for any $A \in \mathcal{B}(X)$

$P\{\omega : \pi_t x(\omega) \in A\} = P\{\omega : \pi_t (I+G)^{-1}[\pi_t u(\omega)] \in A\} = P\{\omega : \pi_t u(\omega) \in (I+G)A\}$

where $(I+G)A \in \mathcal{B}(X)$, since $(I+G)^{-1}$ is continuous on X. Thus,
the formula

$\mu_{\pi_t x}(A) = \mu_{\pi_t u}[(I+G)A]$

follows from the above equalities and the definition of induced
distributions.
Let $\{t_n\}_{n=1}^{\infty}$ be an increasing (unbounded) sequence of elements
of $R_+$ and consider the sequence $\{\mu_{\pi_{t_n} x}\}$. Let f be any element
of BC(X), the bounded, continuous functionals on X, then

$\int_X f(y)\, \mu_{\pi_{t_n} x}(dy) = \int_X f[(I+G)^{-1} y]\, \mu_{\pi_{t_n} u}(dy).$

Since $(I+G)^{-1}$ is continuous on X by (ii), the function $f[(I+G)^{-1}(\cdot)] : X \to R$
is an element of BC(X). Moreover, since the set $\{\mu_{\pi_t u}\}_{t \in R_+}$ is
assumed to be relatively compact in the weak topology, there exists
a subsequence (unbounded) $\{t_{n_m}\}_{m=1}^{\infty} \subset \{t_n\}_{n=1}^{\infty}$ such that the
subsequence

$\left\{ \int_X f[(I+G)^{-1} y]\, \mu_{\pi_{t_{n_m}} u}(dy) \right\}_{m=1}^{\infty}$

converges. Hence, the original sequence $\{\mu_{\pi_{t_n} x}\}_{n=1}^{\infty}$ has a convergent
subsequence. The arbitrary nature of the set $\{t_n\}_{n=1}^{\infty}$ leads to the
desired conclusion that $\{\mu_{\pi_t x}\}_{t \in R_+}$ is relatively compact.

QED



In other words the theorem says that on the function space X,
totally L-bounded (stochastic) inputs give rise to totally L-bounded
outputs if the (deterministic) system operator I+G possesses a bounded,
continuous, causal inverse on X. Boundedness of the signals is not
the usual notion of boundedness in norm, but a more refined concept
defined in terms of the distributions induced on X by the signals.

Although it may be considered as a rather direct consequence
of Topsøe's Theorem (section 2.4), Theorem 8 serves a number of
purposes. First, it unites in a simple way the Prohorov theory of
convergence and the deterministic operator stability theory to give
interesting results for stochastic systems. And it executes
this union in such a way as to make directly applicable the deterministic
stability criteria (at least in their incremental form)
to problems in this setting. Secondly, it again establishes the
invertibility of operators as a key tool in the class of problems
being considered here. In this way Theorem 8 is the analog of
Willems' result (Theorem 2.2.2). Corollary 9 below makes the Small-Gain
Theorem applicable in this general setting and provides the
link to explicit criteria based on this result.

Corollary 9 ; Let G be an element of and 


$\tilde{\rho}(G) = \sup_{x_1, x_2 \in X,\ x_1 - x_2 \ne 0} \dfrac{\|G x_1 - G x_2\|}{\|x_1 - x_2\|}$

then the system operator I+G under consideration maps totally L-bounded
inputs ($u \in S(\Omega;X)$) into totally L-bounded outputs ($x \in S(\Omega;X)$) if
$\tilde{\rho}(G) < 1$.

Proof: Clearly $\rho(G) \le \tilde{\rho}(G)$ and so (i) of Theorem 8 is satisfied. An
easy calculation suggested in the proof of Corollary 6 shows that
$(I+G)^{-1}$ is Lipschitz on X with Lipschitz constant $[1-\tilde{\rho}(G)]^{-1}$
and, hence, is bounded and continuous. Causality of $(I+G)^{-1}$
is assumed, thus (ii) of Theorem 8 holds.

QED
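The "easy calculation" in the proof can be illustrated in finite dimensions. The sketch below is an assumption-laden analog, not the setting of the corollary: signals are finite sequences with the sup norm, and the map `G` (a delayed sine with incremental gain `RHO`) is hypothetical. It solves x + G(x) = u by Picard iteration and compares the distance between two solutions with the Lipschitz constant $[1-\rho]^{-1}$.

```python
import math

RHO = 0.6  # assumed incremental gain of G, must be < 1

def G(x):
    """Hypothetical causal map: (Gx)[k] = RHO * sin(x[k-1]), (Gx)[0] = 0."""
    return [0.0] + [RHO * math.sin(v) for v in x[:-1]]

def solve(u, tol=1e-12, max_iter=1000):
    """Picard iteration x <- u - G(x) for the equation x + G(x) = u."""
    x = [0.0] * len(u)
    for _ in range(max_iter):
        x_new = [ui - gi for ui, gi in zip(u, G(x))]
        if max(abs(p - q) for p, q in zip(x_new, x)) < tol:
            return x_new
        x = x_new
    raise RuntimeError("iteration did not converge")

def sup_dist(p, q):
    """Sup-norm distance between two sequences of equal length."""
    return max(abs(s - t) for s, t in zip(p, q))

u1 = [math.cos(0.2 * k) for k in range(50)]
u2 = [math.cos(0.2 * k) + 0.1 * math.sin(k) for k in range(50)]
x1, x2 = solve(u1), solve(u2)

# Lipschitz bound for the inverse: |x1 - x2| <= |u1 - u2| / (1 - RHO)
print(sup_dist(x1, x2) <= sup_dist(u1, u2) / (1.0 - RHO))
```

The iteration converges because the map x -> u - G(x) is a contraction with factor `RHO`, which is the same mechanism that gives the Lipschitz constant in the proof.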

Examples illustrating the last four theorems are postponed 
until the sections following this one. In the remainder of this 
section the operator G defining the feedback system will be permitted 
to be random and the results from the latter paragraphs of section 
2.4 used to investigate the system properties. Thus, let G be an 
element of F(S2; j^) and let u e S(ft;X). Assume that G and u are 
independent under P. The properties of x defined by

(2) $x(\omega) + G(\omega)[x(\omega)] = u(\omega)$

are at issue here. Referring to section 2.3 for comments on the
existence and measurability of solutions to (2), the assumption
of locally bounded solutions will as usual be made.

Assumption (A2): For the equation (2) it is assumed that $\pi_t u \in S(\Omega;X)$
implies that $\pi_t x \in S(\Omega;X)$. That is, that bounded, measurable
inputs ($\pi_t u$) give rise to bounded, measurable outputs, at least on
finite intervals [0,t]. Bounded means in $\|\cdot\|$ on X.

This assumption implies that for every $\omega \in \Omega$, $I+G(\omega)$ has a
locally bounded inverse on $X_e$, and moreover, that this inverse maps
measurable signals (elements of $S(\Omega;X)$) into measurable signals.

Now let $\{\mu_{\pi_t u}\}_{t \in R_+}$ denote the distributions induced by $\pi_t u$
on $\mathcal{B}(X)$. Put

$\nu_{\pi_t x}(\omega)(\cdot) = \mu_{\pi_t u}[(I+G(\omega))(\cdot)]$

The Borel measureability of G e F(Q; Jj p ) assures that v e F(fi;PM(X)) 

* it X 

t 

(recall this notation from section 2.4). Then from these remarks and 
Theorem 2.4.11 the following result gives a partial description of x. 
Theorem 10: For the equation (2) defined on the function space X,
subject to the above assumptions on G, let $u \in S(\Omega;X)$, then by (A2)
$\pi_t x \in S(\Omega;X)$ for every $t \in R_+$. Moreover, if $\rho_1(G) < 1$ where

$\rho_1(G) = \operatorname{ess\,sup}_{\omega \in \Omega}\ \sup_{x_1, x_2 \in X,\ x_1 - x_2 \ne 0} \dfrac{\|G(\omega)[x_1] - G(\omega)[x_2]\|}{\|x_1 - x_2\|}$

then $x \in S(\Omega;X)$; and if the set $\{\mu_{\pi_t u}\}_{t \in R_+}$ is relatively compact
(as a subset of the metric space (PM(X),L)), then so is $\{\nu_{\pi_t x}(\omega)\}_{t \in R_+}$
in the (w,L) topology on $F(\Omega;PM(X))$.

Proof: Using the causality of $G(\omega)$ for every $\omega$

$\pi_t x(\omega) = \pi_t u(\omega) - \pi_t G(\omega)[\pi_t x(\omega)]$

Thus,

$\|\pi_t x\| \le \|\pi_t u\| + \|\pi_t G(\omega)[\pi_t x(\omega)]\| \le \|u\| + \rho_1(G)\, \|\pi_t x(\omega)\|$

which proves that $x \in S(\Omega;X)$ when combined with (A2) (establishing
measurability of the truncated signals), using a simple limiting
argument.

By a modification of the usual argument the condition $\rho_1(G) < 1$
implies that $[I+G(\omega)]^{-1}$ maps X into itself (and is Lipschitz
continuous) for each $\omega \in \Omega$. Let $\{\mu_{\pi_t u}\}_{t \in R_+}$ be the distributions
induced on $\mathcal{B}(X)$ by $\pi_t u$. Then using Theorem 2.4.11 (i) the conclusion
of the theorem follows.

QED


Corollary 11: If $\rho_2(G) < 1$ where

$\rho_2(G) = \sup_{x_1, x_2 \in X,\ x_1 - x_2 \ne 0} E\left\{ \dfrac{\|G(\omega)[x_1] - G(\omega)[x_2]\|}{\|x_1 - x_2\|} \right\}$

then $\{\mu_{\pi_t u}\}$ L-relatively compact implies that $\{\nu_{\pi_t x}(\omega)\}$ is relatively
compact in the (w,L) topology on $F(\Omega;PM(X))$.

Proof: In Theorem 2.4.11 put

$K = [1-\rho_2(G)]^{-1} < \infty$.

QED


The lack of symmetry in these results renders them provisional 
in nature. In the next two sections this deficiency is avoided by 
specializing the random operator G to be a nonlinear convolution in
a special form. The space X is also restricted to be the continuous 
functions or the piecewise continuous functions. In the general case, 
however, this problem remains open. 
3.2 Convolution versus a Wiener Process:

In this section the general results of the last section are 
reconsidered for a special class of random operators formed by 
convolution versus a Wiener process. Three particular problems 
are analyzed here: For a general convolution versus a Wiener 

process sample properties are discussed and moment inequalities 
derived. For a nonlinear convolution equation moment bounds are 
obtained for the solution and a condition similar to the Circle 
Theorem (section 2.2) is used to guarantee the existence of an 
invariant solution distribution. Finally, as a corollary to the 
analysis of the nonlinear case a linear convolution is considered 
and a condition like the Nyquist Criterion given to guarantee the 
asymptotic invariance of the solution distributions. 

Let w denote the usual real-valued Wiener process on $R_+$, normalized
so that $w(0) = 0$. The Wiener measure w is a probability measure on
$(C(R_+;R), \mathcal{B}(C))$ satisfying two properties. For each $t, s \in R_+$ the
random variable $w(t)-w(s)$ is normally distributed (on R) with mean

$E\{w(t)-w(s)\} = m(t-s)$

and variance

$E\{[w(t)-w(s)-m(t-s)]^2\} = \sigma^2 |t-s|$.

And for any finite collection of elements $\{t_i\} \subset R_+$ such that
$t_1 \le t_2 \le \dots \le t_n < \infty$, the random variables $w(t_2)-w(t_1)$, $w(t_3)-w(t_2)$,
... are independent under (the measure) w.

For any C-valued random variable x on $(\Omega, \mathcal{B}, P)$ let $\mathcal{B}_{st}(x) \subset \mathcal{B}$
denote the minimal Borel $\sigma$-algebra over which x(r) is measurable
for $r \in [s,t]$. Symbolically,


■ U » x _1 ( 10(0). 

C rets.t) r 

In particular let $\mathcal{B}_{st}(dw)$ denote the least Borel $\sigma$-algebra over
which $w(r)-w(q)$ is measurable for $s \le r \le q \le t$.

Endow $C(R_+)$ with the metric (see section 2.4)

$d(x,y) = \sum_{n=1}^{\infty} 2^{-n}\, \dfrac{\|x-y\|_n}{1 + \|x-y\|_n}$, $\|x\|_n = \sup_{t \in [0,n]} |x(t)|$.
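A minimal numerical sketch of this metric may help fix ideas. It is an approximation under stated assumptions: the sup over [0,n] is taken on a uniform grid, and the series is truncated at `n_max` terms (an error of at most $2^{-n_{max}}$); the function names are illustrative only.

```python
import math

def sup_norm_on(f, n, samples_per_unit=200):
    """Approximate sup over [0, n] of |f(t)| on a uniform grid."""
    steps = n * samples_per_unit
    return max(abs(f(k / samples_per_unit)) for k in range(steps + 1))

def d(x, y, n_max=20):
    """Truncated d(x,y) = sum_n 2^-n * ||x-y||_n / (1 + ||x-y||_n)."""
    def diff(t):
        return x(t) - y(t)
    total = 0.0
    for n in range(1, n_max + 1):
        s = sup_norm_on(diff, n)
        total += 2.0 ** (-n) * s / (1.0 + s)
    return total

# Example: the distance stays below 1 even for functions whose difference is unbounded.
def ramp(t):
    return 0.5 * t

print(d(math.sin, math.cos), d(math.sin, ramp))
```

Because each term is bounded by $2^{-n}$, convergence in d is uniform convergence on every bounded interval [0,n], which is why unbounded continuous functions cause no difficulty.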


Let f be a continuous functional on (C,d), and assume that the
measurable function $g : R_+ \times R_+ \to R$ is a causal convolution kernel,
i.e., $g(t,s) = 0$ for $s > t$. Then the operator

$(Gx)(t,\omega) = \int_0^t g(t,s)\, f(s,(\pi_s x)(\omega))\, dw(s,\omega)$

is well defined as an Itô integral [40] on non-anticipating random
functions $x \in F(\Omega;C)$, i.e., those for which $\mathcal{B}_{0t}(x) \vee \mathcal{B}_{0t}(dw)$ is
independent of $\mathcal{B}_{t\infty}(dw)$ for every $t \in R_+$. (Here $\mathcal{B}_1 \vee \mathcal{B}_2$ denotes
the least Borel algebra containing both $\mathcal{B}_1$ and $\mathcal{B}_2$.)

Let $u \in F(\Omega;C)$ be a non-anticipating random function in the
above sense. As a special case of the general feedback equations
of the last section consider the following equation.

(1) $x(t,\omega) = u(t,\omega) - \int_0^t g(t,s)\, f(s,x(s,\omega))\, dw(s,\omega)$
Theorem 1: Conditions sufficient for the existence of a solution
$x \in F(\Omega;C)$ (with locally bounded second moments) such that
$\mathcal{B}_{0t}(x) \vee \mathcal{B}_{0t}(u) \vee \mathcal{B}_{0t}(dw)$ is independent of $\mathcal{B}_{t\infty}(dw)$ are
that

(i) $|f(s,z)|^2 \le a^2(s)\,|z|^2$, $z \in R$

(ii) $\int_0^t g^2(t,s)\, a^2(s)\, ds < \infty$

See [55, Chapter J] where a much more general existence theorem 
is proved using the usual Picard approximations. 

The properties of the moments of x are of fundamental importance 
in establishing the ultimate invariance properties of the distribution 
of x. The existence theorem above guarantees that the first and 
second moments of x are locally bounded (finite on any bounded interval 
[0,T]). The next theorem gives a bound on the entire half-line. 

Assume that $f : R_+ \times R \to R$ is continuous and that

$|f(s,z)| \le |a(s)|\,|z|$, $z \in R$,

for some real-valued continuous function a. Assuming the hypothesis
of Theorem 1, the mean of x, the solution of (1), evolves according to

$E\{x(t)\} = E\{u(t)\} - \int_0^t g(t,s)\, E\{f(s,x(s))\}\, m\, ds$.

Theorem 2: (i) If

$\sup_{t \in R_+} \int_0^t |g(t,s)|\,|a(s)|\,|m|\, ds \le \alpha < 1$

then

$\sup_{t \in R_+} E\{|x(t)|\} \le (1-\alpha)^{-1} \sup_{t \in R_+} E\{|u(t)|\}$.

(ii) Let $E\{u(t)\} = 0$, $m = 0$, then $E\{x(t)\} = 0$, and if

$\sup_{t \in R_+} \sigma^2 \int_0^t g^2(t,s)\, a^2(s)\, ds \le \alpha_1 < 1$

then

$\sup_{t \in R_+} (E\{x^2(t)\})^{1/2} \le (1-\sqrt{\alpha_1})^{-1} \sup_{t \in R_+} (E\{u^2(t)\})^{1/2}$.

Proof: (i) This part of the theorem follows easily from the
inequalities (assume $m > 0$):

$E\{|x(t)|\} \le E\{|u(t)|\} + \int_0^t |g(t,s)|\, E\{|f(s,x(s))|\}\, m\, ds$

$\le E\{|u(t)|\} + \int_0^t |g(t,s)|\,|a(s)|\, E\{|x(s)|\}\, m\, ds$

(ii) The first statement of this part follows from Theorem 1
and the properties of the Itô integral [50, p.24]. The remainder of
(ii) follows from

$(E\{x^2(t)\})^{1/2} \le (E\{u^2(t)\})^{1/2} + \left(E\left\{\left(\int_0^t g(t,s)\, f(s,x(s))\, dw(s)\right)^2\right\}\right)^{1/2}$

$\le (E\{u^2(t)\})^{1/2} + \sigma \left(\int_0^t g^2(t,s)\, a^2(s)\, ds\right)^{1/2} \sup_{0 \le s \le t} (E\{x^2(s)\})^{1/2}$

QED

It is via bounds on the second moment that Corollary 2.4.8 is 
used to establish the existence of an invariant limit (in distribution) 
for x. The remainder of this section will be devoted to a statement
and proof of this property for two special cases of (1) corresponding 
to certain restrictions on the functional f in (1). The first result 
below gives an improved moment bound for the nonlinear case under 
these restrictions. 

Let (1) be replaced by

(2) $x(t,\omega) = u(t,\omega) - \int_0^t g(t-s)\, f(s,x(s,\omega))\, dw(s,\omega)$

where g is now a time-invariant kernel and Theorem 1 is assumed to
be in force. Assume, moreover, that

$0 < a \le \dfrac{f(s,z)}{z} \le b < \infty$, $s \in R_+$, $z \in R$.

Theorem 3: Under the additional assumptions that $E\{u(t)\} = 0$,
$E\{dw(t)\} = 0$, for every $t \in R_+$ the conditions:

(i) $\int_0^\infty e^{-r_0 t}\, g^2(t)\, dt < \infty$ for some $r_0 < 0$;

(ii) for $H(r+jv) = \int_0^\infty e^{-rt} e^{-jvt}\, g^2(t)\, dt$, and some $r \in (r_0,0)$,
the exclusion below holds

$(-2\sigma^{-2}(a^2+b^2)^{-1},\, j0) \notin \bigcup_{v \in R,\ r > r_0} H(r+jv)$; and

(iii) $\inf_{v \in R} \left| H^{-1}(r+jv) + \dfrac{\sigma^2}{2}(a^2+b^2) \right| > \dfrac{\sigma^2}{2}(b^2-a^2)$
for some $r \in (r_0,0)$

imply that $\sup_{t \in R_+} E\{x(t)^2\} \le \beta \sup_{t \in R_+} E\{u(t)^2\}$ for some finite $\beta > 0$.
Proof: An easy calculation gives

$E\{x^2(t)\} = E\{u^2(t)\} + \sigma^2 \int_0^t g^2(t-s)\, E\{f^2(s,x(s))\}\, ds$

$= E\{u^2(t)\} + \frac{1}{2}\sigma^2(a^2+b^2) \int_0^t g^2(t-s)\, E\{x^2(s)\}\, ds + \sigma^2 \int_0^t g^2(t-s)\, E\{\tilde{f}^2(s,x(s))\}\, ds$

where $\tilde{f}^2(s,z) = f^2(s,z) - \frac{1}{2}(a^2+b^2) z^2$.

By (ii) — a 2 (a 2 +b 2 )H(z+jv) ^ -1, thus by two lemmas of BeneS |[2>/, 
Lemmas 4,5, p. 32] the operator $I + \frac{1}{2}\sigma^2(a^2+b^2)H$ (H defined by (ii))
has a continuous inverse represented by the identity minus a convolution.
Hence,

$E\{x^2(t)\} = (I + \frac{1}{2}\sigma^2(a^2+b^2)H)^{-1}(Eu^2)(t) + \sigma^2 \int_0^t \tilde{h}(t-s)\, E\{\tilde{f}^2(s,x(s))\}\, ds$

where $\tilde{h}$ is the function whose Fourier transform is
$\tilde{H}(jv) = H(jv)[1 + \frac{1}{2}\sigma^2(a^2+b^2)H(jv)]^{-1}$. An easy calculation verifies

$|\tilde{f}^2(s,z)| \le \frac{1}{2}(b^2-a^2) z^2$.

Thus,

$E\{x^2(t)\} \le (I + \frac{1}{2}\sigma^2(a^2+b^2)H)^{-1}(Eu^2)(t) + \frac{1}{2}\sigma^2(b^2-a^2) \int_0^t \tilde{h}(t-s)\, E\{x^2(s)\}\, ds$.

Condition (iii) establishes

$\sup_{v \in R} \left| \frac{1}{2}\sigma^2(b^2-a^2)\, \tilde{H}(r+jv) \right| < 1$

and the conclusion of the theorem follows from the $L_\infty$-version
of the deterministic Circle Theorem in Zames [75] (given in section
2.2).


Remark; Note that 


H(r+jv) 



G (r+j (v-v ) )G(r+jv ) dv 
o o o 


QED 


where 

G(rfjv) - 

And so, the criteria could have easily been stated in terms of 
the r-shifted Fourier transform of g.

The sufficiency of the following theorem is easily established 
using the techniques of the last proof. 

Theorem 4: Consider the linear integral equation

$x(t,\omega) = u(t,\omega) - \int_0^t g(t-s)\, x(s,\omega)\, dw(s,\omega)$,

then subject to $E\{u(t)\} = 0$, $E\{dw(t)\} = 0$ and Theorem 1, the
condition

$\sigma^2 \int_{-\infty}^{\infty} |G(jv)|^2\, dv < 2\pi$

is necessary and sufficient to guarantee

$\sup_{t \in R_+} E\{x^2(t)\} \le \gamma \sup_{t \in R_+} E\{u^2(t)\}$ for some $\gamma$.


f e -rt e - ^ Vt g (fc)dt 
J 0 
Proof: (Necessity) Using the properties of the stochastic integral,
the following equation is easily derived

$E\{x^2(t)\} = E\{u^2(t)\} + \sigma^2 \int_0^t g^2(t-s)\, E\{x^2(s)\}\, ds$.

Rewriting this equation as

$y(t) = z(t) + \int_0^t h(t-s)\, y(s)\, ds$

where the $L_\infty$-boundedness of y is at question, the conclusion (both
parts) of the theorem follows from a result of Davis [11] and the
observation that y is a continuous function on the half-line $R_+$, which
follows from Theorem 1.

QED
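The second-moment equation in the proof is a deterministic Volterra equation and can be explored numerically. By Parseval's relation $\int |G(jv)|^2 dv = 2\pi \int_0^\infty g^2(t)\, dt$, so the condition of Theorem 4 reads $\sigma^2 \int_0^\infty g^2(t)\, dt < 1$. The sketch below uses an explicit rectangle rule and the illustrative kernel $g(t) = e^{-t}$ (so $\int g^2 = 1/2$ and the threshold is $\sigma^2 = 2$); the kernel, step size, and constant input $z \equiv 1$ are assumptions made for the example.

```python
import math

def second_moment(z, g, sigma2, dt):
    """Rectangle-rule solution of y(t) = z(t) + sigma2 * int_0^t g(t-s)^2 y(s) ds."""
    y = []
    for k in range(len(z)):
        acc = sum(g((k - j) * dt) ** 2 * y[j] for j in range(k))
        y.append(z[k] + sigma2 * dt * acc)
    return y

def kernel_energy(g, dt, horizon):
    """Rectangle-rule approximation of int_0^horizon g(t)^2 dt."""
    return dt * sum(g(i * dt) ** 2 for i in range(1, int(horizon / dt) + 1))

def g(t):
    return math.exp(-t)          # illustrative kernel

dt, n = 0.02, 500
z = [1.0] * n                    # stands in for E{u^2(t)}
energy = kernel_energy(g, dt, 40.0)

y_ok = second_moment(z, g, 1.0, dt)    # sigma^2 * energy < 1: bounded
y_bad = second_moment(z, g, 3.0, dt)   # sigma^2 * energy > 1: grows

print(max(y_ok) <= 1.0 / (1.0 - energy), y_bad[-1] > 100.0)
```

With $\sigma^2 = 1$ the computed second moment stays below $[1 - \sigma^2 \int g^2]^{-1} \sup z$, while with $\sigma^2 = 3$ it grows without bound over the simulated horizon, in line with the necessity and sufficiency asserted in the theorem.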

By further specializing the input process u it is possible to 
use the criteria of Theorems 2 and 3 to establish the asymptotic 
invariance of the solution distribution. 

Theorem 5: Consider the integral equation

(3) $x(t,\omega) = u(t,\omega) - \int_0^t g(t-s)\, f(s,x(s,\omega))\, dw(s,\omega)$

subject to the existence condition of Theorem 1. Assume that u
and w are independent, $E\{u(t)\} = 0$, $E\{dw(t)\} = 0$ and, moreover,
that the process u satisfies the Lipschitz condition

$|u(t,\omega) - u(s,\omega)| \le \gamma |t-s|$, $\gamma > 0$, $t, s \in R_+$

almost surely ($\omega$), and the moment bound $E\{u^2(t)\} \le \beta^2 < \infty$. Then a
necessary and sufficient condition that the solution x of (3) be
totally L-bounded in $(S(\Omega;C),L)$ is that

$E\{x^2(t)\} \le \alpha^2 < \infty$, $t \in R_+$,

for some constant $\alpha > 0$. Moreover, if u is stationary then x is
asymptotically stationary with respect to u and the increments of w.

Clearly then Theorem 3 gives a sufficient condition for the 
distributions of x to be bounded (or ultimately invariant) for nonlinear, 
conic functions f. Theorem 4 gives a necessary and sufficient 
condition in the special case of linear, constant functions f. Both 
criteria are stated in terms of the Fourier transform of g, and are 
thus subject to the usual design interpretations used for feedback 
systems including a linear, time-invariant, convolution operation.

Proof of the theorem: The proof is based on a lemma of Itô and Nisio [41]
stated as Corollary 2.4.8 above. It follows the pattern of a similar
proof in [41]. The verification of the hypothesis of that lemma proceeds
in three steps, the first showing that the solution x of (3) is
totally L-bounded.

Lemma 6: Let the kernel g be locally square integrable, that is,
$\int_s^t |g(\tau)|^2\, d\tau < \infty$ for $t, s \in R_+$; then there exists a constant
$\eta = \eta(\epsilon,T)$ such that for any $\epsilon > 0$, $T > 0$,

$P\{\omega : \sup_{s \le t \le s+T} |x(t,\omega)| > \eta\} < \epsilon$, for every $s \in R_+$.

Proof: From the definition of a solution

$x(t) = x(s) + u(t) - u(s) - \int_s^t g(t-\tau)\, f(\tau,x(\tau))\, dw(\tau) - \int_0^s [g(t-\tau) - g(s-\tau)]\, f(\tau,x(\tau))\, dw(\tau)$.

And so, setting

$S = \sup_{s \le t \le s+T} |x(t)|$

the inequality

$S \le |x(s)| + |u(s)| + \sup_{s \le t \le s+T} |u(t)| + \sup_{s \le t \le s+T} \left| \int_s^t g(t-\tau)\, f(\tau,x(\tau))\, dw(\tau) \right| + \sup_{s \le t \le s+T} \left| \int_0^s [g(t-\tau) - g(s-\tau)]\, f(\tau,x(\tau))\, dw(\tau) \right|$

$= V + W + X + Y + Z$

follows. Thus

$P(S > \eta) \le P(V > \eta/5) + P(W > \eta/5) + P(X > \eta/5) + P(Y > \eta/5) + P(Z > \eta/5)$.

Now

$P(V > \eta/5) \le \dfrac{5}{\eta}\, (E\{x^2(s)\})^{1/2}$,

and $P(W > \eta/5)$ is bounded in the same way. The analysis of the next three
terms is somewhat more delicate. From the Lipschitz assumption on u

$|u(t,\omega)| \le \gamma |t-s| + |u(s,\omega)|$.

Hence,

$\sup_{s \le t \le s+T} |u(t,\omega)| \le \gamma T + |u(s,\omega)|$

and so, for $\eta > 5\gamma T$

$P\{X > \eta/5\} \le P\{|u(s,\omega)| > \eta/5 - \gamma T\}$


IL 


n-5yT * 


For Y and Z consider the following 


P{Y > n/5} < <r 2 (^) 2 b 2 sup 

n s«t<»H J 9 


f^ T 2 2 

1 g (T)E{x Z (t-T)}dT 

1 a 


< a 


V <1> V 


g 2 (T)dT 


Similarly, for Z 


P{Z > n/5} < o 2 


<c 4 a 


* v i: 


w^.r 


[g(t-T)-g(s-T) ] 2 E{x 2 (t) } dT 


g (T)dr] 


Therefore, the bound for n 5yT 


P(a > n> r 4 1 (<2+6 ) + + 5 (4) 2 a 2 c 


ri ' n-5YT T 5( n r v 2 * 2 * 2 J 0 


9fT 


g (T)dT 


holds, and clearly for any $\epsilon$, $T > 0$ an $\eta$ may be chosen sufficiently
large to imply

$P\{S > \eta\} < \epsilon$.

QED (Lemma 6)
The second step in the proof requires verification of the
following lemma. 

Lemma 7 : There exists a constant £ ■ £(m,T) such that for every 

t, v e [ s , s+T ] the following inequality holds (almost surely oj) 

E{ |x(t)-x(v) |* sup |x(r)| < m} 4 £|t-v| 2 
s<r*s+T 

if for every s, T e R + 

6.(1) “Sup (if g 2 (T)dT) 2 < » 

(¥t<T c *0 

6 2 (s,T) - sup (-“ J [g(t-v-x)-g(T)] 2 d ) 2 < ® 
s<v<t«s+T z v 


Proof: Again express the solution to (3) as 


x(t)-x(v) - u(t)-u(v) - g(t-T)f (x(T))dw(T) 

^v 

[g(t-T)-g(v-T)]f (x(T))dw(T) 

where the arbitrary assumption t i v has been made. Using 

(c+d) 4 « 8c 4 +8d 4 and the pointvlse assumption on u, the following 
obtains 

E{fx(t)-x(v) | 4 I sup |x(r)| < m} 

*s<r<s+T 



< 8y 2 | t-v | 2 



g(t-x)f(x(T))dw(x )) 4 


sup 

s*r<s+T 


|x(r) | A m) 
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+ 64E{(] [g(t-T)-g(v-T)]f (x(T))dw(x )) 4 j sup |x(r)| < m} 

* n *8<r<8+T 


- X + Y + 2. 


Now 


[ f 8^(t-T)g*(t-pl)E {f ^( x(t) ) f ^ (jc(v0)dw^(x)dw^(ij) } 

• XT * XT m 


Y - 64 


2/- -« .2/^ ft2/__/_ VVi 2/ /..vi ^ 2/_i .2, 


'v 'v 


where includes the conditioning sup |x(r)| < m. Thus, 

s<r<S+T 


Y < 64 o 4 b 4 


® 4 ( [ g 2 (T) dT) 2 < 64 a 4 h 4 mV (s+T) | t-v | 2 

J xt A 


By similar arguments 

A 


Z < 64a 


WV<f [g(t-T)-g(v-T)] 2 dT) 2 
Jn 


,4_4. 4 


* 64a m b H 6 2 (s,T) 1 1 — v | ' 


where <5^ and $ 2 are given in the hypothesis of the Lemma. Choosing 


£ - 8y 2 + 64a 4 b 4 m 4 [<5 (s+T)+6,(s,T)] 


satisfies the assertion of the lemma. 


QED (Lemma 7) 

Next the assertion that the solution x of (3) is totally L-bounded
is verified.

Lemma 8: The conditions of Lemmas 6 and 7 imply that x is totally
L-bounded.

Proof: Denote by $\theta_s$ the shift operator

$(\theta_s x)(t) = x(s+t)$,

and by $(\cdot)_+$ the function $(r)_+ = \max(r,0)$. Using Lemma 6, define
the constants n k ■» u(e(k),T(k)) = n(2"’ k ,2k+r) » then 

P{ sup 1 6 (x(t))| < n. ) 

-k-«t<k 8 k 

* p{ sup |x(t)| < n t ) * l-2~ k 

(s-k-T) + <«s+k K 

Let the function £ in Lemma 7 define the constants 
?k “ £(\» 2k+x), then from Lemma 7 for t,v e [(s-k) + ,s+k] 

E{|x(t)-x(v) | 4 | sup |x(t) | < n. > C C. |t-v| 2 

1 (s-k-T) + *t€s+k 

Define A. c C(R) as {h e C: sup |h(t) | < n.} , then 

-n-T<«n K 

I (0gX)(t)-(6 a x)(v) | 4 J 6 a x e A^) < n k |t-v| 2 

and the conclusion of this Lemma follows from Corollary 2.4.8. 

QED (Lemma 8) 

The remainder of the proof of the theorem follows from the
last lemma. Let (PM(C),L) be the set of probability measures on
$C(R_+;R)$ equipped with the Prohorov metric. Then from Lemma 8 the
induced distributions $\{\mu_{\theta_s x}\}_{s \in R_+}$ on $(C(R_+), \mathcal{B}(C))$ are relatively compact.
By the Lipschitz assumption on u the set $\{\mu_{\theta_s u}\}_{s \in R_+}$ is relatively compact,
and setting $(\theta_s w)(t) = w(t+s) - w(s)$ it is easily shown (using Corollary 2.4.8)
that $\{\mu_{\theta_s w}\}$ is relatively compact. Recalling the fact that the direct
product of (relatively) compact sets is (relatively) compact, then
the set of distributions $\{\mu_s\}_{s \in R_+}$ induced on $(C \times C \times C, \mathcal{B}(C \times C \times C))$ by
$(\theta_s x, \theta_s u, \theta_s dw)$ is relatively compact. This establishes the first
assertion of the Theorem.

Now using the fact that $\theta_s h$ is continuous in (s,h) on $R_+ \times C(R_+)$
(for the metric d), the function $\mu_s(A)$ is measurable on $R_+$ for any
set $A \in \mathcal{B}(C \times C \times C)$. Hence, the function (of t)

$\nu_t(A) = \dfrac{1}{t} \int_0^t \mu_s(A)\, ds$

is continuous on $R_+$ for any A as above.

Since the set $\{\mu_s\}$ is relatively compact, by Prohorov's Theorem
(Theorem 2.4.2 here) for any $\epsilon > 0$ there exists a compact subset
$K(\epsilon) \subset (C \times C \times C)(R_+)$ independent of $s \in R_+$ such that $\mu_s(K) > 1-\epsilon$ and
therefore such that $\nu_t(K) > 1-\epsilon$ for every $t \in R_+$. Thus, the set $\{\nu_t\}_{t \in R_+}$
is relatively compact, and there exists a measure $\nu_\infty \in PM(C \times C \times C)$ and an
increasing sequence $\{t_n\}_{n=1}^{\infty}$ such that $\nu_{t_n} \to \nu_\infty$ weakly, or equivalently in
the L-topology.


Let $(\tilde{x},\tilde{u},\tilde{w})$ be the $(C \times C \times C)(R_+)$-valued random variable whose
probability law is $\nu_\infty$. It remains to show that

(i) $(\tilde{u},\tilde{w}) \sim (u,w)$,
(ii) $\tilde{x}$ is stationarily correlated with respect to $(\tilde{u},\tilde{w})$,
and (iii) $\tilde{x} = \tilde{u} - G\tilde{x}$.

Point (i) follows from the stationarity of u and of the increments of w.
To show (ii) consider continuous, bounded functionals 
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k m n 

on R , R , R respectively and the series of equalities 

^l^t.+t** • * » x t. +t ), *'2 (u t:+t* * * * » u t , +t ) ^3 (w tv+f * * * ,w t"+t ) * 
1 k 1 a 1 n 

■ I r j 0 ' ds E(,( 'l <i t 1+ t+8 ) i.l , <’2 (5 t«« +S > j.l 


* r 


is E{ ' > l (i t 1+ s > *2 (S tj+a ) } 


lio £■ 
T 

i r*-® i 


1 r T * 


“ E{i^(x^ »***» X £ ) 4 , 2^ u ^» » • • • » u £ •) • »w ») } 

1 k 1 m S 1 n 

Here the third equality follows from the symbolic decomposition 

’ I 

t 

and the boundedness properties of (x,u,w) over finite intervals. That this
series of equalities for all $\psi_1, \psi_2, \psi_3$ determines the properties of
the finite dimensional distributions of (x,u,w) is fundamental, see
Gikhman and Skorokhod [29, Chapter 3].

To show (iii) it suffices to show that for every $s \in [0,t]$

(iii)' $x(t) = x(s) + u(t) - u(s) - (Gx)(t) + (Gx)(s)$.

An argument used in Itô and Nisio [41] may be applied directly at this
point to yield the desired conclusion.

QED (Theorem 5)

In the event that the function f is linear (f(z) ■ az, a > 0) 

Theorem 5 may be sharpened using Theorem 4 to prove: 
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Corollary 9 : For the linear integral equation 

x(t) *» u(t) - a [ g(t-s)x(s)dw(s) 
j ° 

subject to the assumptions on u, w, and g expressed in the hypothesis of 
Theorem 5, a necessary and sufficient condition that the distribution of 
x be ultimately stationary is that 

(4) a 2 a 2 [ | GCjv) | 2 dv < 2 tt . 

J — 00 

3.3 Convolution Versus a Levy Process :

The most immediate modification of the integral equation investigated
in the last section is to consider the convolution operator with the
Wiener measure replaced by a Levy measure, representative of the most
general process with independent increments. As is well known [29], the
Levy process has sample paths with at most countably many jump
discontinuities in any finite interval. Moreover, it may be decomposed into
a linear combination of a Wiener process and a general Poisson process. In a
feedback system, jump processes may be considered as models of random shock
phenomena, and Levy process models as descriptive of combinations of
continuous and shock random signals. It is therefore appropriate to review
the properties of such processes, whose sample paths are quite different
from those of the Wiener process and its transformations.

Let $\{\xi_n\}_{n=1}^{\infty}$ be a set of independent, identically distributed random
variables on some probability space $(\Omega, \mathcal{F}, P)$. Assume that the
distribution function of the $\xi_n$ is

$$F(\alpha) = P\{\omega : \xi_n(\omega) \le \alpha\} = \begin{cases} 1 - e^{-\lambda\alpha}, & \alpha \ge 0 \\ 0, & \alpha < 0 \end{cases}$$

where $\lambda > 0$. Note $E\{\xi_n\} = \lambda^{-1}$. Let $S_n(\omega) = \sum_{i=1}^{n} \xi_i(\omega)$;
then the distribution of $S_n$ is

$$F_n(\alpha) = \begin{cases} 1 - e^{-\lambda\alpha} \sum_{k=0}^{n-1} \frac{(\lambda\alpha)^k}{k!}, & \alpha \ge 0 \\ 0, & \alpha < 0. \end{cases}$$

A Poisson process $x(t,\omega)$, $t \in R^+$, $\omega \in \Omega$, may be defined via

$$x(t,\omega) = \begin{cases} \max\{k : S_k(\omega) \le t\}, & S_0(\omega) = 0 \\ \infty, & \text{if } S_k(\omega) \le t \text{ for all } k. \end{cases}$$

Note that $x(t,\omega) = n$ if and only if $S_n(\omega) \le t$ and $S_{n+1}(\omega) > t$. Thus,
the induced distribution of $x$ is

$$P\{\omega : x(t,\omega) = n\} = \begin{cases} \frac{(\lambda t)^n e^{-\lambda t}}{n!}, & n = 0, 1, 2, \ldots \\ 0, & n = \infty. \end{cases}$$

From this expression $E\{x(t)\} = \lambda t$ and $E\{(x(t) - \lambda t)^2\} = \lambda t$. Intuitively,
the Poisson process represents a quantity increasing by unit jumps
occurring at random instants of time.
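The construction above (count how many partial sums of exponential waiting times fall at or below t) can be checked by simulation. The following is a minimal sketch, not part of the original text; the function names, the seed, and the sample size are illustrative assumptions. It estimates the mean and variance of x(t), which should both be close to λt.

```python
import random


def poisson_count(rate, t, rng):
    """Return x(t) = max{k : S_k <= t}, where S_k is a sum of k Exp(rate) waiting times."""
    count = 0
    arrival = rng.expovariate(rate)  # S_1
    while arrival <= t:
        count += 1
        arrival += rng.expovariate(rate)  # S_{k+1} = S_k + xi_{k+1}
    return count


def sample_moments(rate, t, n_paths=20000, seed=0):
    """Monte Carlo estimates of E{x(t)} and E{(x(t) - mean)^2} (hypothetical helper)."""
    rng = random.Random(seed)
    counts = [poisson_count(rate, t, rng) for _ in range(n_paths)]
    mean = sum(counts) / n_paths
    var = sum((c - mean) ** 2 for c in counts) / (n_paths - 1)
    return mean, var
```

For example, `sample_moments(2.0, 3.0)` should return two numbers near λt = 6, up to sampling error.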

A somewhat more general process which accounts for random jump
amplitudes is defined as follows. Let $\{\eta_k\}_{k=1}^{\infty}$ be a set of independent,
identically distributed random variables with common distribution function
$F_\eta(\alpha) = P\{\omega : \eta(\omega) \le \alpha\}$, $\alpha \in R$. Let $x$ be a Poisson process
defined as above, independent of the $\eta_k$, and governed by parameter
$\lambda > 0$. A compound Poisson process $y$ may be defined by the expression

$$y(t,\omega) = \begin{cases} \sum_{k=1}^{x(t,\omega)} \eta_k(\omega), & x(t,\omega) \ge 1 \\ 0, & x(t,\omega) = 0. \end{cases}$$

In words, $y(t,\omega)$ jumps by $\eta_k(\omega)$ at the instant that $x(t,\omega)$ changes from
$k-1$ to $k$. The distribution function of $y(t)$ is determined as [36]

$$F_{y(t)}(\alpha) = P\{\omega : y(t,\omega) \le \alpha\} = \sum_{n=0}^{\infty} \frac{(\lambda t)^n e^{-\lambda t}}{n!}\, F_\eta^{(n)}(\alpha)$$

where $F_\eta^{(n)} = F_\eta^{(n-1)} * F_\eta$ and $F_\eta^{(1)} = F_\eta$ ($*$ denotes convolution).
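The series for the distribution of a compound Poisson process can be compared against simulation when the jump law has a known n-fold convolution. The sketch below is an illustration only and makes two assumptions not in the text: the jumps are exponential with rate `mu` (so the n-fold convolution is the Erlang distribution given earlier for the partial sums), and the n = 0 term is taken as a unit step at zero, corresponding to an empty sum. All function names are hypothetical.

```python
import math
import random


def erlang_cdf(n, mu, alpha):
    """n-fold convolution of the Exp(mu) distribution function, evaluated at alpha."""
    if alpha < 0:
        return 0.0
    if n == 0:
        return 1.0  # empty sum: y = 0, a unit step at zero (assumption)
    term, partial = 1.0, 0.0
    for k in range(n):  # partial = sum_{k=0}^{n-1} (mu*alpha)^k / k!
        partial += term
        term *= mu * alpha / (k + 1)
    return 1.0 - math.exp(-mu * alpha) * partial


def compound_cdf(lam, mu, t, alpha, n_max=80):
    """Truncated series: sum_n Poisson(n; lam*t) * F^(n)(alpha)."""
    weight = math.exp(-lam * t)  # Poisson probability of n = 0
    total = 0.0
    for n in range(n_max + 1):
        total += weight * erlang_cdf(n, mu, alpha)
        weight *= lam * t / (n + 1)
    return total


def empirical_cdf(lam, mu, t, alpha, n_paths=20000, seed=1):
    """Fraction of simulated paths with y(t) <= alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_paths):
        y, arrival = 0.0, rng.expovariate(lam)
        while arrival <= t:
            y += rng.expovariate(mu)  # jump amplitude eta_k
            arrival += rng.expovariate(lam)
        hits += y <= alpha
    return hits / n_paths
```

The two functions `compound_cdf` and `empirical_cdf` should agree to within Monte Carlo error for the same arguments.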

Continuing the reasoning of the previous sections, the paragraphs
that follow define an operator capable of describing the presence of
"random shocks" in a feedback system. The asymptotic properties of such
systems are then analysed using this operator.

Let $(X, \mathcal{B}(X))$ be a measurable space and consider the random
measure on $\mathcal{B}(R^+) \times \mathcal{B}(X)$ denoted by $\nu([s,t],A)$, $[s,t] \subset R^+$,
$A \in \mathcal{B}(X)$, as expressing the number of events in the set $A$ during the
interval $[s,t]$. Assume that the random variable $\nu$ takes on non-negative
values, independent on disjoint sets from $\mathcal{B}(R^+) \times \mathcal{B}(X)$. And for each
set $[s,t] \times A \in \mathcal{B}(R^+) \times \mathcal{B}(X)$, assume that $\nu([s,t],A)$ is Poisson
with parameter $\int_s^t \Pi(\tau,A)\, d\tau$; i.e.,

$$P\{\omega : \nu(\omega,[s,t],A) = n\} = \frac{1}{n!} \left( \int_s^t \Pi(\tau,A)\, d\tau \right)^n \exp\left( -\int_s^t \Pi(\tau,A)\, d\tau \right).$$

Here $\Pi(t,A)$ is a probability measure on $\mathcal{B}(X)$ for each $t \in R^+$, and a
measurable function $R^+ \to R$ for each $A \in \mathcal{B}(X)$.

It follows that the random process $\nu$ is a process with independent
increments (on $R^+$); so the stochastic integral

$$\int_0^t \int_X \ell(\tau,x)\, \nu(d\tau,dx)$$

for non-anticipating random functionals $\ell$ on $R^+ \times X$ such that

$$\int_0^t \int_X E|\ell(\tau,x)|^k\, \Pi(\tau,dx)\, d\tau < \infty; \quad k = 1, 2;\ t \in R^+$$

is well-defined as the usual limit of Riemann sums, see also Ito [40] and
Gikhman and Dorogovcev [28].


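Let

$$\tilde\nu(t,A) = \nu([0,t],A) - \int_0^t \Pi(\tau,A)\, d\tau,$$

then the following hold:

(i) $E\left\{ \int_0^t \int_X \ell(\tau,x)\, \tilde\nu(d\tau,dx) \right\} = 0$

(ii) $E\left\{ \left( \int_0^t \int_X \ell(\tau,x)\, \tilde\nu(d\tau,dx) \right)^2 \right\} = \int_0^t \int_X E|\ell(\tau,x)|^2\, \Pi(\tau,dx)\, d\tau$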
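The two moment properties of the centered (compensated) Poisson integral can be illustrated numerically in the simplest setting. The sketch below is an assumption-laden illustration, not the general construction: the mark space is collapsed to a single point, the intensity is a constant `rate`, and the integrand is the deterministic function ℓ(τ) = τ. Under those assumptions the integral has mean zero and second moment rate · t³/3.

```python
import random


def compensated_integral(rate, t, ell, ell_integral, rng):
    """One sample of  sum_i ell(tau_i) - rate * int_0^t ell(tau) dtau,
    where tau_i are the jump times of a homogeneous Poisson process on [0, t]."""
    total = 0.0
    arrival = rng.expovariate(rate)
    while arrival <= t:
        total += ell(arrival)
        arrival += rng.expovariate(rate)
    return total - rate * ell_integral


def moments(rate=3.0, t=2.0, n_paths=20000, seed=2):
    """Return (sample mean, sample second moment, predicted second moment)."""
    rng = random.Random(seed)
    ell = lambda tau: tau          # illustrative deterministic integrand
    ell_integral = t ** 2 / 2.0    # int_0^t tau dtau
    samples = [compensated_integral(rate, t, ell, ell_integral, rng)
               for _ in range(n_paths)]
    mean = sum(samples) / n_paths
    second = sum(v * v for v in samples) / n_paths
    predicted = rate * t ** 3 / 3.0  # rate * int_0^t tau^2 dtau
    return mean, second, predicted
```

With the default arguments the predicted second moment is 8, and the sample mean should be near zero.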


Now let the process $x$ be defined on $R^+ \times \Omega$ into $X$ as a non-anticipating
($\mathcal{B}_{0t}(x) \vee \mathcal{B}_{0t}(\nu([0,s],\cdot))$ is independent of $\mathcal{B}_{t\infty}(\nu([s,t],\cdot))$)
functional of $\nu$. Let $H$ be an operator on $X$-valued non-anticipating random
functions behaving as follows: if the "input" to $H$ at time $t$ is $x(t)$,
then $H$ causes a displacement of $x$ by

$$\int_s^t \int_X h(\tau, x(s), y)\, \nu(d\tau,dy)$$

over the interval $[s,t] \subset R^+$. Here $h$ is some (continuous) function
mapping $R^+ \times X \times X \to X$.

Recalling the definitions of the last section, the remainder of this
section is devoted to an analysis of the integral equation (1) below as
a model of a stochastic system with unity feedback (here the space $X = R$).

(1) $x(t,\omega) = u(t,\omega) - \int_0^t g(t-s)\, f(s,x(s,\omega))\, dw(s,\omega) - \int_0^t g(t-s) \int_R h(s,x(s,\omega),y)\, \nu(\omega,ds,dy)$

From Skorokhod [55] the following existence theorem gives conditions under
which the equation (1) is well-posed.

Theorem 1 : [55, Section 3.3] Assume that the functions $u, g, f, h$ satisfy
the following conditions:

(i) $u(\cdot,\omega)$ for each $\omega \in \Omega$ has only finite jump discontinuities ($u$ is
real-valued), and $E\{u(t)^2\} < \infty$ for $t \in [0,T]$, $T$ finite.

(ii) There exists a $K < \infty$ such that for all $t \in R^+$

$$\int_0^t |g(t-s)|^2\, |f(s,x) - f(s,y)|^2\, ds + \int_0^t |g(t-s)|^2 \int_R |h(s,x,\alpha) - h(s,y,\alpha)|^2\, \Pi(s,d\alpha)\, ds \le K|x-y|^2; \quad x, y \in R.$$

(iii) There exists a $K < \infty$ such that for all $t \in R^+$

$$\int_0^t |g(t-s)| \int_R |h(s,x,y)|\, \Pi(s,dy)\, ds \le K(1+|x|); \quad x \in R.$$

Then a solution $x$ of the integral equation (1) exists, is locally bounded
almost surely, and has only jump discontinuities. Moreover, if
$\sup_{0 \le t \le T} E\{u(t)^2\} < \infty$, then $\sup_{0 \le t \le T} E\{x(t)^2\} < \infty$ for any
$T \in R^+$. The solution $x$ is unique at all points of continuity.

Before proceeding to the analysis of the nonlinear equation (1), consider
the linear case (corresponding to $f$ and $h$ linear)

(2) $x(t) = u(t) - \int_0^t g(t-s)\, x(s)\, dw(s) - \int_0^t g(t-s)\, x(s) \int_R h(y)\, \nu(ds,dy)$

where

$E\{dw(t)\} = m\, dt$
$E\{(dw(t) - m\, dt)^2\} = \sigma^2\, dt$
$E\{\nu(dt,A)\} = \Pi(A)\, dt$
$E\{(\nu(dt,A) - \Pi(A)\, dt)^2\} = \Pi(A)\, dt.$

Assume that $u$, $w$, and $\nu$ are independent processes. Then clearly, assuming
Theorem 1 holds,

$$E\{x(t)\} = E\{u(t)\} - \int_0^t g(t-s)\, E\{x(s)\}\, m\, ds - \int_0^t g(t-s)\, E\{x(s)\} \int_R h(y)\, \Pi(dy)\, ds.$$

Hence,

Theorem 2: Assume that $g \in L_1(R^+)$ and let $G(s)$ denote the Laplace
transform of $g$. Then $E\{|u(t)|\} < \infty$ implies $E\{|x(t)|\} < \infty$ if and only if

$$(-(m+\pi)^{-1}, j0) \notin \bigcup_{\mathrm{Re}(s) \in R^+} G(s)$$

where $\pi = \int_R h(y)\, \Pi(dy)$.
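The mean equation that leads to Theorem 2 is a deterministic Volterra equation, and for a first-order kernel it can be integrated directly. The sketch below is an illustration under stated assumptions that are not in the text: the kernel is g(t) = k e^{-pt}, the input mean is E{u(t)} = 1, and the constant `c` stands for m + π. For this kernel G(s) = k/(s+p), so the excluded-point condition reduces to p + ck > 0. The function names and step size are hypothetical.

```python
import math


def mean_response(k, p, c, T=5.0, h=1e-3):
    """Left-rectangle solution of  x(t) = 1 - c * int_0^t k e^{-p(t-s)} x(s) ds,
    returning x(T). The exponential kernel allows an O(n) running update of
    y(t) = int_0^t k e^{-p(t-s)} x(s) ds."""
    decay = math.exp(-p * h)
    y, x = 0.0, 1.0
    for _ in range(int(round(T / h))):
        y = decay * (y + h * k * x)  # advance the convolution by one step
        x = 1.0 - c * y
    return x


def closed_form(k, p, c, T=5.0):
    """Exact solution of the same equation, valid when q = p + c*k is nonzero."""
    q = p + c * k
    return 1.0 - (c * k / q) * (1.0 - math.exp(-q * T))
```

With `k=2, p=1, c=0.5` the mean settles near 0.5; with `c=-1` (so that p + ck < 0) it grows without bound, in line with the criterion.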

Now consider the problem of bounding the second moment of $x$. An easy
transformation of equation (2) gives

(3) $x(t) = u(t) - \int_0^t g(t-s)\, x(s)\, (m+\pi)\, ds - \int_0^t g(t-s)\, x(s)\, d\tilde w(s) - \int_0^t g(t-s)\, x(s) \int_R h(y)\, \tilde\nu(ds,dy)$

where $d\tilde w(s) = dw(s) - m\, ds$ and $\tilde\nu(ds,dy) = \nu(ds,dy) - \Pi(dy)\, ds$.
Assuming now the conditions of Theorems 1 and 2, the following holds

$$x(t) = (G_1 u)(t) - \int_0^t \tilde g(t-s)\, x(s)\, d\tilde w(s) - \int_0^t \tilde g(t-s)\, x(s) \int_R h(y)\, \tilde\nu(ds,dy)$$

where $G_1$ is the linear deterministic convolution whose kernel $g_1$ has
Fourier transform $G_1(j\alpha) = [1 + (m+\pi) G(j\alpha)]^{-1}$ and the kernel $\tilde g$ has
Fourier transform $\tilde G(j\alpha) = G(j\alpha)\, G_1(j\alpha)$. In this case

$$E\{x(t)^2\} = \int_0^t \int_0^t g_1(t-s)\, g_1(t-r)\, E\{u(s)u(r)\}\, ds\, dr + \int_0^t \tilde g^2(t-s)\, E\{x(s)^2\}\, (\sigma^2+\pi)\, ds$$

from which the following is clear.


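Theorem 3 : Let $g \in L_1(R^+)$, and assume that Theorems 1 and 2 apply, then
$\sup_{t \in R^+} E\{u(t)^2\} < \infty$ implies $\sup_{t \in R^+} E\{x(t)^2\} < \infty$ if and only if

(i) $(-(m+\pi)^{-1}, j0) \notin \bigcup_{\mathrm{Re}(s) \in R^+} G(s)$

and (ii) $\|\tilde g\|_2 < (\pi + \sigma^2)^{-1/2}$,

or equivalently,

(ii)' $\int_{-\infty}^{\infty} \left| \frac{G(j\alpha)}{1 + (\pi+m) G(j\alpha)} \right|^2 d\alpha < 2\pi (\pi + \sigma^2)^{-1}$.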


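As an illustrative example, consider the linear convolution represented
by $G(s) = k/(s+p)$, then

$$\frac{1}{2\pi} \int_{-\infty}^{\infty} \left| \frac{G(j\alpha)}{1 + (m+\pi) G(j\alpha)} \right|^2 d\alpha = \frac{k^2}{2(p + (m+\pi)k)}.$$

Hence, $\sup_{t \in R^+} E\{x(t)^2\} \le \beta \sup_{t \in R^+} E\{u(t)^2\}$ for some $\beta \in R^+$ if and
only if

$$\frac{k^2 (\sigma^2 + \pi)}{2} - (m+\pi) k < p.$$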
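The integral condition of Theorem 3 can be evaluated numerically for a given transfer function. The sketch below is illustrative and rests on assumptions not in the text: the substitution α = tan θ with a midpoint rule is one convenient quadrature among many, `pi_rate` stands for the constant π of the theorem (renamed to avoid confusion with `math.pi`), and the closed-loop pole test used for condition (i) is specific to the first-order kernel G(s) = k/(s+p) with p > 0.

```python
import math


def l2_norm_squared(G, shift, n=20000):
    """(1/(2*pi)) * int_{-inf}^{inf} |G(ja) / (1 + shift*G(ja))|^2 da,
    computed with a = tan(theta) and the midpoint rule on (-pi/2, pi/2)."""
    step = math.pi / n
    total = 0.0
    for i in range(n):
        theta = -math.pi / 2 + (i + 0.5) * step
        g = G(1j * math.tan(theta))
        total += abs(g / (1.0 + shift * g)) ** 2 / math.cos(theta) ** 2
    return total * step / (2.0 * math.pi)


def theorem3_holds(k, p, m, pi_rate, sigma2):
    """Check (i) and (ii) of Theorem 3 for G(s) = k/(s+p) (first-order case only)."""
    shift = m + pi_rate
    if p + shift * k <= 0:  # closed-loop pole s = -(p + shift*k) not in the open left half plane
        return False
    G = lambda s: k / (s + p)
    return l2_norm_squared(G, shift) < 1.0 / (pi_rate + sigma2)
```

For the first-order kernel the quadrature can be compared with the value k²/(2(p + (m+π)k)) obtained by integrating the squared closed-loop impulse response directly.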
Two sufficient conditions were proved in [66] for a special case of
(2) (corresponding to $\nu \equiv 0$); these may be modified to apply in this case,
and they yield conditions more easily checked for a given kernel $g$ than
the criteria (ii) or (ii)' of Theorem 3.

Corollary 4 : Assume that $g \in L_1(R^+)$. Then for equation (3)
$\sup_{t \in R^+} E\{x(t)^2\} \le \beta \sup_{t \in R^+} E\{u(t)^2\}$ for some $\beta \in R^+$ if there
exists a $\gamma \in R$ such that

(i) $\frac{\sigma^2 + \pi}{m + \pi + \gamma}\, g(0) < 1$,

(ii) and one of the following conditions is satisfied:

(a) $(m+\pi)/\gamma > 0$, and the Nyquist locus $\bigcup_{\alpha \in R} G(j\alpha)$ lies inside
the circle centered on the real axis of the complex plane
at $(\frac{1}{2}\gamma^{-1}, j0)$ and passing through the origin.

(b) $-1 < (m+\pi)/\gamma < 0$, and the Nyquist locus $\bigcup_{\alpha \in R} G(j\alpha)$ lies
inside the disc centered on the real axis at $(\frac{1}{2}\gamma^{-1}, j0)$ and
passing through the origin.

(c) $(m+\pi)/\gamma < -1$, and the Nyquist locus $\bigcup_{\alpha \in R} G(j\alpha)$ does not
intersect or encircle the disc centered at $(\frac{1}{2}\gamma^{-1}, j0)$ passing
through the origin.
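Condition (a) of Corollary 4 is a geometric test on the Nyquist locus and is easy to check on a frequency grid. The sketch below is a hedged illustration: it assumes the disc is centered at (1/(2γ), j0) with radius 1/(2|γ|), which is the reading consistent with the bound used in the proof, and it only samples the locus at finitely many frequencies, so it is a numerical indication rather than a proof. The function name and grid are hypothetical.

```python
def locus_inside_disc(G, gamma, alphas, tol=1e-12):
    """True if every sampled point G(j*alpha) lies in the closed disc that is
    centered at 1/(2*gamma) on the real axis and passes through the origin."""
    center = 0.5 / gamma
    radius = abs(center)
    return all(abs(G(1j * a) - center) <= radius + tol for a in alphas)


# Illustrative frequency grid from -1000 to 1000 rad/s in steps of 0.01.
GRID = [0.01 * i for i in range(-100000, 100001)]
```

For G(s) = 1/(s+2) the locus is a circle of diameter 1/2 through the origin, so it fits inside the disc for γ = 1 (diameter 1) but not for γ = 3 (diameter 1/3).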

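Proof : By Theorem 3 it suffices to show that

$$(\sigma^2 + \pi)\, \frac{1}{2\pi} \int_{-\infty}^{\infty} \left| \frac{G(j\alpha)}{1 + (m+\pi) G(j\alpha)} \right|^2 d\alpha < 1.$$

Using the restrictions on the graph of $G(j\alpha)$, it follows that

$$\left| \frac{(m+\pi) G(j\alpha)}{1 + (m+\pi) G(j\alpha)} \right|^2 \le \left[ 1 + \frac{\gamma}{m+\pi} \right]^{-1} \mathrm{Re}\left[ \frac{(m+\pi) G(j\alpha)}{1 + (m+\pi) G(j\alpha)} \right].$$

Thus,

$$(\sigma^2 + \pi)\, \frac{1}{2\pi} \int_{-\infty}^{\infty} \left| \frac{G(j\alpha)}{1 + (m+\pi) G(j\alpha)} \right|^2 d\alpha \le \frac{\sigma^2 + \pi}{m + \pi + \gamma}\, \frac{1}{2\pi} \int_{-\infty}^{\infty} \mathrm{Re}\left[ \frac{G(j\alpha)}{1 + (m+\pi) G(j\alpha)} \right] d\alpha = \frac{\sigma^2 + \pi}{m + \pi + \gamma}\, g(0).$$

The last step uses $g \in L_1(R^+)$ and the assumption that zero is a Lebesgue
point of $g$ [1, p.5].

QED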


The next result is a special case of Corollary 4 as $\gamma \to 0$.

Corollary 5: Assume that $g \in L_1(R^+)$, then for equation (3)
$\sup_{t \in R^+} E\{x(t)^2\} \le \beta \sup_{t \in R^+} E\{u(t)^2\}$ for some $\beta \in R^+$ if

(i) $m + \pi > (\sigma^2 + \pi)\, g(0)$
and (ii) $\mathrm{Re}\, G(j\alpha) > 0$ for all $\alpha \in R$.

While Corollary 5 involves a "passivity" property of the operator $G$,
Corollary 4 is reminiscent of the various "circle criteria" introduced
above (sections 2.2, 3.2, and Theorem 6 below), and its primary use is
to provide easily verified conditions for moment bounds in the equations
being considered. That is, for any of the integral conditions given above
(Theorem 3, Theorems 3.2.3 and 3.2.4) sufficient conditions may be derived
directly in terms of restraints on the kernel $g$, rather than the quantity
$\|g\|_2$ appearing in the results mentioned, by using arguments similar to
those in the proof of Corollary 4.

Returning then to the analysis of the nonlinear equation (1), assume
that

$E\{dw(t)\} = 0$
$E\{[dw(t)]^2\} = \sigma^2\, dt$
$E\{\nu(dt,dy)\} = \Pi(dy)\, dt$
$E\{[\nu(dt,dy) - \Pi(dy)\, dt]^2\} = \Pi(dy)\, dt$

and that there exist constants $a, b, c, d$ such that

$0 < a \le f(t,x)/x \le b < \infty$; $t \in R^+$, $x \in R$,

$0 < c \le h(t,x,y)/x \le d < \infty$; $t \in R^+$, $x, y \in R$.

Moreover, assume that $E\{u\} = 0$ and that $u$, $w$, and $\nu$ are independent processes.

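Theorem 6 : For equation (1) under the assumptions of the last paragraph
$\sup_{t \in R^+} E\{x(t)^2\} \le \beta \sup_{t \in R^+} E\{u(t)^2\}$ for some $\beta \in R^+$ if

(i) There exists an $r_0 > 0$ such that

$$\int_0^{\infty} \exp(r_0 t)\, |g(t)|\, dt < \infty.$$

(ii) $\pi = \int_R \Pi(dy) < \infty$.

(iii) $(-\{\pi(c+d)/2\}^{-1}, j0) \notin \bigcup_{\mathrm{Re}(s) \ge -r_0} G(s)$.

(iv) For $\tilde G(s) = G(s)[1 + \frac{1}{2}\pi(c+d) G(s)]^{-1}$ (see (iii)) and $G_2$ the
transform of $\tilde g^2$,

$$\left( -\left\{ \tfrac{1}{2} [\sigma^2(a^2+b^2) + \pi(c^2+d^2)] \right\}^{-1}, j0 \right) \notin \bigcup_{\mathrm{Re}(s) \ge -r_0} G_2(s).$$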

(v) For some $\alpha \in (0,1)$ and $r \in (0, r_0)$

$$\inf_{\mathrm{Re}(s) \ge -r} \left| G_2^{-1}(s) + \tfrac{1}{2} [\sigma^2(a^2+b^2) + \pi(c^2+d^2)] \right| > \frac{1}{2\alpha} [\sigma^2(b^2-a^2) + \pi(d^2-c^2)].$$
and (vi) For some $r \in (0, r_0)$ and


H(t+jO 


f” G(r + j 
J-« (r + 


r + j(5 ' C 0 )G(r + j? 0 ) 
(r + i'U - Z n M* + JU 


j H(r + j© if 2 (d 2 -c 2 ) i - 

sup 1 2 2 2 2 2 ~ ^ 1 “ CX 

££R 1 + -|[o (aW) + if(c +d^)G 2 (r+j0 

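Proof : A transformation of (1) gives

$$x(t) = u(t) - \int_0^t g(t-\tau) \int_{-\infty}^{\infty} \hat h(\tau, x(\tau), y)\, \Pi(dy)\, d\tau - \tfrac{1}{2}\pi(c+d) \int_0^t g(t-\tau)\, x(\tau)\, d\tau$$
$$- \int_0^t g(t-\tau)\, f(\tau, x(\tau))\, dw(\tau) - \int_0^t g(t-\tau) \int_{-\infty}^{\infty} h(\tau, x(\tau), y)\, \tilde\nu(d\tau,dy)$$

where $\tilde\nu(d\tau,dy) = \nu(d\tau,dy) - \Pi(dy)\, d\tau$ and $\hat h(t,x,y) = h(t,x,y) - \frac{1}{2}(c+d)x$,
and $\pi$ is defined in condition (ii). Let $W(s) = [1 + \frac{1}{2}\pi(c+d) G(s)]$, then
by (ii) and (iii) and from, for example, [12], $W^{-1}$ exists on $L_\infty(R^+)$
functions. Hence,

$$x(t) = (W^{-1}u)(t) - \int_0^t \tilde g(t-\tau) \int_{-\infty}^{\infty} \hat h(\tau, x(\tau), y)\, \Pi(dy)\, d\tau$$
$$- \int_0^t \tilde g(t-\tau)\, f(\tau, x(\tau))\, dw(\tau) - \int_0^t \tilde g(t-\tau) \int_{-\infty}^{\infty} h(\tau, x(\tau), y)\, \tilde\nu(d\tau,dy)$$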
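where $\tilde G$, the Fourier transform of $\tilde g$, is defined above. Then taking into
account the assumptions on $u$, $w$, and $\nu$,

$$E\{x(t)^2\} = E\{(W^{-1}u)^2(t)\} + E\left\{ \left( \int_0^t \tilde g(t-\tau) \int_{-\infty}^{\infty} \hat h(\tau, x(\tau), y)\, \Pi(dy)\, d\tau \right)^2 \right\}$$
$$+ \int_0^t \tilde g^2(t-\tau)\, E\{f^2(\tau, x(\tau))\}\, \sigma^2\, d\tau + \int_0^t \tilde g^2(t-\tau) \int_{-\infty}^{\infty} E\{h^2(\tau, x(\tau), y)\}\, \Pi(dy)\, d\tau.$$

Again adding and subtracting the terms

$$\tfrac{1}{2}\sigma^2(a^2+b^2) \int_0^t \tilde g^2(t-\tau)\, E\{x(\tau)^2\}\, d\tau, \qquad \tfrac{1}{2}\pi(c^2+d^2) \int_0^t \tilde g^2(t-\tau)\, E\{x(\tau)^2\}\, d\tau,$$

the result is

$$E\{x(t)^2\} + \tfrac{1}{2} [\sigma^2(a^2+b^2) + \pi(c^2+d^2)] \int_0^t \tilde g^2(t-\tau)\, E\{x(\tau)^2\}\, d\tau$$
$$= E\{(W^{-1}u)^2(t)\} + E\left\{ \left( \int_0^t \tilde g(t-\tau) \int_{-\infty}^{\infty} \hat h(\tau, x(\tau), y)\, \Pi(dy)\, d\tau \right)^2 \right\}$$
$$+ \sigma^2 \int_0^t \tilde g^2(t-\tau)\, E\{\hat f^2(\tau, x(\tau))\}\, d\tau + \int_0^t \tilde g^2(t-\tau) \int_{-\infty}^{\infty} E\{\hat h^2(\tau, x(\tau), y)\}\, \Pi(dy)\, d\tau$$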

where $\hat f^2(t,x) = f^2(t,x) - \frac{1}{2}(a^2+b^2)x^2$ and $\hat h^2(t,x,y) = h^2(t,x,y) - \frac{1}{2}(c^2+d^2)x^2$.
Setting $K$ to be the linear convolution operator whose
Fourier transform is $K(s) = [1 + \frac{1}{2}[\sigma^2(a^2+b^2) + \pi(c^2+d^2)]\, G_2(s)]^{-1}$,
where $G_2(s)$ is defined in the theorem statement, and using (iv)

(4) $E\{x(t)^2\} = K(E\{(W^{-1}u)^2\})(t)$
$$+ \int_0^t k(t-\tau)\, E\left\{ \left( \int_0^{\tau} \tilde g(\tau-s) \int_{-\infty}^{\infty} \hat h(s, x(s), y)\, \Pi(dy)\, ds \right)^2 \right\} d\tau$$
$$+ \sigma^2 \int_0^t \hat g(t-s)\, E\{\hat f^2(s, x(s))\}\, ds + \int_0^t \hat g(t-s) \int_{-\infty}^{\infty} E\{\hat h^2(s, x(s), y)\}\, \Pi(dy)\, ds$$

where $\hat g$ is the kernel whose transform is $\hat G(s) = G_2(s) K(s)$, and $k$ is the
kernel of $K$. Using the bounds

$|\hat f^2(s,x)| \le \frac{1}{2}(b^2-a^2)x^2$, for every $s \in R^+$,

$|\hat h^2(s,x,y)| \le \frac{1}{2}(d^2-c^2)x^2$, for every $s \in R^+$, $y \in R$,

and condition (v), it is clear that the last two terms in equation (4) are
bounded by $\alpha \sup_{0 \le s \le t} E\{x(s)^2\}$. Closer consideration of the decisive
term (T2), second on the right of (4), will yield the desired conclusion.

Expanding the square,

$$\int_0^{\tau} \int_0^{\tau} \tilde g(\tau-s)\, \tilde g(\tau-\rho) \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} E\{\hat h(s,x(s),y)\, \hat h(\rho,x(\rho),z)\}\, \Pi(dy)\, \Pi(dz)\, ds\, d\rho$$
$$\le \int_0^{\tau} \int_0^{\tau} \tilde g(\tau-s)\, \tilde g(\tau-\rho) \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} [E\{\hat h^2(s,y)\}]^{1/2}\, [E\{\hat h^2(\rho,z)\}]^{1/2}\, \Pi(dy)\, \Pi(dz)\, ds\, d\rho$$
$$\le \tfrac{1}{2}\pi^2(d^2-c^2) \left( \int_0^{\tau} \tilde g(\tau-s)\, ds \right)^2 \sup_{0 \le s \le \tau} E\{x(s)^2\}.$$

Hence,

$$T2 \le \tfrac{1}{2}\pi^2(d^2-c^2) \int_0^t |k(t-\tau)| \left( \int_0^{\tau} \tilde g(s)\, ds \right)^2 \sup_{0 \le s \le \tau} E\{x(s)^2\}\, d\tau.$$

And so, condition (vi) implies that

$$|T2| \le (1-\alpha) \sup_{s} E\{x(s)^2\}$$

and that the combined operator composed of T2 and the sum of the last
two terms is a contraction on the Banach space defined by the norm
$\|x\| = [\sup_{s \in R^+} E\{x(s)^2\}]^{1/2}$. The conclusion of the Theorem follows easily
from this point using familiar arguments from the earlier sections.

QED

While Theorem 6 may be regarded as a direct generalization of Theorem
3.2.3 (nonlinear convolution versus a Wiener process), the comparatively
more complicated conditions (i)-(vi) of Theorem 6 would seem to preclude
the graphical interpretation possible for the conditions of the earlier
theorem. No attempt will be made here to weaken Theorem 6 to permit such
an interpretation, though the promise of such a procedure is acknowledged.

In order to complete the extension begun in this section it is necessary
to prove the analog of Theorem 3.2.5, using Theorem 6 to prove asymptotic
invariance of the solution of equation (1) under appropriate
assumptions on $u$, $w$, and $\nu$. While conceptually no more difficult, the
statement and proof of the analog is technically more complex because of
the nature of the solution sample paths of equation (1). Recall that
the basic existence theorem for this situation (Theorem 1 here), adapted
from [55], guarantees only that the solution trajectories will be piecewise
continuous. It is therefore necessary to discuss weak convergence of
distributions on spaces of piecewise continuous functions. Recall that in
section 2.4 it was rather easy to determine conditions for a set of
distributions on the space of continuous functions to be compact by using
a modification of the Ascoli Theorem [16] to characterize compact sets of
continuous functions and Prohorov's Theorem (2.4.5).

Needed, thus, are a topology on the set of piecewise continuous functions
rendering them separable and complete (so that Theorem 2.4.2 will be
necessary and sufficient in this case) and a characterization of the compact
subsets in this topology. Combining the work of Skorokhod [56], Billingsley
[7], and Stone [58], the necessary framework is available. Rather than state
this technical structure and then prove the theorem, the result will be
stated, and the appropriate elements of the theory of weak convergence of
measures on piecewise continuous functions used in the proof stated as
lemmas.

Theorem 7 : Consider the equation (1) under the assumptions

(i) $f$ and $h$ satisfy the sector conditions with the parameters $(a,b)$
and $(c,d)$ respectively.

(ii) $E\{dw(t)\} = 0$ and $E\{[dw(t)]^2\} = \sigma^2\, dt$.

(iii) $E\{\nu(dt,dy)\} = \Pi(dy)\, dt$ and $E\{[\nu(dt,dy) - \Pi(dy)\, dt]^2\} = \Pi(dy)\, dt$.

(iv) $u$, $w$, $\nu$ are independent, $u$ is piecewise continuous (from the
right) almost everywhere ($P$), and $E\{u(t)\} = 0$; $E\{u(t)^2\} \in L_\infty(R^+)$.

For $s, t$ points of continuity (almost sure) of $u$ and $\tau \in [s,t]$

$$E\{|u(t) - u(\tau)|^{1/2}\, |u(\tau) - u(s)|^{1/2}\} \le \gamma |t-s|^2.$$

(v) The kernel $g \in L_1(R^+) \cap L_2(R^+)$. (Much less restrictive conditions
are possible here.)

Then the criterion of Theorem 6 is sufficient to guarantee the asymptotic
invariance of the solution process $x$.

Outline of Proof :

Definition 8 : Let $D(R^+;R)$ denote the space of real-valued functions on $R^+$
which have a limit from the right and are continuous from the left.

Elements of $D(R^+;R)$ are bounded on compact intervals, and for any
$\epsilon > 0$ have at most a finite number of jumps of amplitude greater than $\epsilon$
in any bounded interval [7].

Lemma 9 : [55, section 3.3] The existence Theorem 1 implies that $x \in F(\Omega;D)$,
the set of $D$-valued random variables on $\Omega$, if $u \in F(\Omega;D)$.

Lemma 10 : [55], [58], [7, p.115] A metric $d_0$ exists on $D(R^+;R)$ such that
$(D,d_0)$ is a complete, separable metric space.

This lemma assures that Theorem 2.4.2 applies in its full power on
$(D,d_0)$.

Lemma 11 : [7] For a subset $J$ of $D(R^+;R)$ to be relatively compact (with
respect to $d_0$) it is necessary and sufficient that for every $T \in R^+$, and
partition $\{t_i\}_{i=0}^{r}$ of $[0,T]$,

$$\sup_{f \in J}\ \sup_{t \in [0,T]} |f(t)| < \infty,$$

$$\lim_{\delta \to 0}\ \sup_{f \in J}\ \inf_{\{t_i\}}\ \max_{0 \le i < r}\ \sup\{|f(t) - f(s)| ;\ t, s \in [t_i, t_{i+1})\} = 0,$$

where $\delta = \min_i \{t_i - t_{i-1}\}$ is the size of the partition.

This result is the counterpart of the Ascoli Theorem defining compact 
sets of continuous functions. The necessary convergence criterion (compare 
Corollaries 2.4.7 and 2.4.8) is provided by: 

Lemma 12 : A subset $A \subset F(\Omega;D)$ is totally L-bounded if the following
conditions are satisfied for any sequence $\{x_n\} \subset A$:

(i) The sequence $\{x_n(0)\}$ is tight.

(ii) For $s, t$ continuity points of $x_n$ and any $\tau \in [s,t]$

$$E\{|x_n(t) - x_n(\tau)|^{\alpha}\, |x_n(\tau) - x_n(s)|^{\alpha}\} \le \rho |t-s|^{2\beta},$$

for $\alpha \ge 0$, $\beta > 1/2$, and some $\rho > 0$, all independent of $n$.

The proof of Theorem 7 then proceeds to verify the inequality of
Lemma 12 (ii) along the lines of the proof of Theorem 3.2.5; the particular
values of $\alpha$ and $\beta$ used are 2 and 1 respectively. The proof is, however,
tedious and somewhat removed from the main focus of this work and will be
omitted.

In the next section the properties of the solutions of differential
equations subject to totally L-bounded inputs are examined. Conditions on
the coefficients of the equations are found to guarantee that the solution
is totally L-bounded when the driving function has this property.

3.4 Differential Equations with Totally Bounded Inputs : 

In order to illuminate the results of the earlier sections of this
chapter it is worthwhile to consider them in the usual setting provided by
stochastic differential equations. This section consists of two distinct
parts. First a general class of nonlinear functional differential equations
is considered and conditions for total L-boundedness of the solution given.
By assuming the functional coefficients in this equation to be memoryless
functions the solution becomes a diffusion (strong Markov process), and the
latter portion of this section contains a few remarks on this case.

Following Fleming and Nisio [26] (see also Ito and Nisio [41]),
consider the functional stochastic differential equation

(1) $dx(t) = (a(\pi_t x))(t)\, du(t) + (b(\pi_t x))(t)\, dw(t)$

where $a$ and $b$ are continuous functionals on $C(R^-;R)$ (the $R$-valued
continuous functions on the negative real line $R^-$, with the metric $d$
introduced in section 3.2); $w$ is a Wiener process on $(\Omega, \mathcal{F}, P)$; $u$ is a
control to be specified later; and $\pi_t$ is the truncation operator. An initial
function $x_-$ such that $x(t) = x_-(t)$, $t \in R^-$, completes the specification of
the equation. Assume the initial function $x_-$ is an element of $F(\Omega;C(R^-;R))$.

Let $U_\gamma \subset C(R^+;R)$ be the subset of the continuous functions
satisfying the Lipschitz condition below; for $f \in U_\gamma$

$$|f(t) - f(s)| \le \gamma |t-s|; \quad t, s \in R^+, \quad f(0) = 0,$$

for some constant $\gamma$ independent of $f$. Let $U_\gamma$ have the relative topology
induced as a closed subset of $(C(R^+),d)$. Let $S(\Omega;U_\gamma)$ be the set of
$U_\gamma$-valued random variables (signals, because the half-line $R^+$ is the time set).

Proposition 1 : (i) Let $PM(U_\gamma)$ have the Prohorov topology, then $PM(U_\gamma)$ is
relatively compact. (ii) As a subset of $F(\Omega;C)$, the $C(R^+)$-valued random
variables, $S(\Omega;U_\gamma)$ is totally L-bounded.

Proof : (i) It is easy to verify that $(U_\gamma,d)$ is a compact (hence complete
and separable) subset of $(C(R^+),d)$. Part (i) follows from this observation
and Prohorov's Theorem (2.4.5 here). Part (ii) is immediate from (i) or
from Billingsley's result (Theorem 2.4.6).

QED

Thus, the set of stochastic processes permitted as inputs is, in the
terminology of section 2.4, totally L-bounded. The Lipschitz condition,
though severe, is not altogether uncommon in the literature dealing with
stochastic control; see for instance, Fleming and Nisio [26], and Fleming
[24] for some remarks on this assumption. It is a natural constraint in
the framework of the studies here.

A few assumptions are in order on the coefficients in (1) and on the
past condition $x_-$. Assume

(i) $a, b$ are continuous on $(C(R^-),d)$.

(ii) For $f \in C(R^-)$, $t \in R$,

$$|a(f)(t)| + |b(f)(t)| \le \int_{-\infty}^{0} |f(s)|\, dK(s)$$

for some measure $dK$, $\int_{-\infty}^{0} dK(s) < \infty$.

(iii) $E\{x_-(t)^4\} \le c$, $t \le 0$, for some $c < \infty$.

(iv) $\mathcal{B}_{0t}(u) \vee \mathcal{B}_{-\infty 0}(x_-) \vee \mathcal{B}_{0t}(dw)$ is independent of $\mathcal{B}_{t\infty}(dw)$
for every $t \in R^+$.

Theorem 2 : [26, p.783] Under assumptions (i) through (iv) above, equation
(1) has a unique solution $x$ with locally bounded second moments, such that
$x \in F(\Omega;C(R^+))$ and $\mathcal{B}_{0t}(x) \subset \mathcal{B}_{-\infty 0}(x_-) \vee \mathcal{B}_{0t}(u) \vee \mathcal{B}_{0t}(dw)$ for
every $u \in S(\Omega;U_\gamma)$.

Let $\Sigma = \{(\tilde x_-, u, w)\}$ be the collection of triples such that $\tilde x_-$ has the
same probability law as $x_-$ on $C(R^-)$, $u \in S(\Omega;U_\gamma)$, and $w$ is a standard
Wiener process. Following Fleming and Nisio [26], let $s$ denote the generic
element of $\Sigma$ and let $\Psi = \{x_s ;\ s \in \Sigma\}$ denote the set of solutions
generated by elements of $\Sigma$.

Theorem 3 : [26, p.787] The set $\Psi$ is a sequentially compact subset of
$(S(\Omega;C(R^+)),L)$ where $L$ denotes the Prohorov metric.



Thus, for all admissible inputs, the state $x$ is confined to a compact
subset of the state space, in this case $S(\Omega;C(R^+))$. Hence, using the same
techniques employed in section 3.2, for any element $s$ of $\Sigma$, it is
possible to show that the distributions of $x_s$, the corresponding solution,
are convergent in the Cesaro sense used there to an invariant distribution on
$C(R^+)$. In Ito and Nisio [41] a rather more detailed treatment of equation
(1) is presented for the case when $du(t) = dt$, corresponding in a sense
to an autonomous system.

Though giving the desired analogy to the results of sections 3.2 and
3.3, Theorem 3 was used for quite a different purpose in [26]. Consider
the problem of selecting a control $u$ from $S(\Omega;U_\gamma)$ to minimize the
functional $E\{\Phi(x,u)\}$ where $\Phi$ is some positive (values in $R^+$), continuous
functional on $C(R^+) \times U_\gamma$.

Theorem 4 : [26, p.792] Let $\Sigma_1 \subset \Sigma$ be closed under L-sequential limits,
then there exists an element $s_1 \in \Sigma_1$ such that $E\{\Phi(x_1,u_1)\} \le E\{\Phi(x,u)\}$
for any other $s \in \Sigma_1$. Here $x_1$ ($x$) is the solution of (1) corresponding to
$s_1$ ($s$).

As a theorem in stochastic control theory the above result has a proper
place as a preliminary existence theorem; however, it suffers from being
non-constructive and from requiring "total knowledge" of the state $x$. The
existence problem in optimal stochastic control theory is in any case
very difficult, and attempts to proceed beyond theorems of this nature have
not been altogether successful. Some recent work holding the promise of a
solution to the problem is contained in the papers of Benes [3], [4] and
Duncan and Varaiya [15], and the comprehensive survey of Fleming [24].

In the present context Theorem 4 serves to illustrate the earlier
sections of this chapter by providing an alternative application for the
mathematical techniques involved. Note that in the equation (1), the
solution $x$ need not be a Markov process. The same observation holds in the
work in the references [3], [4], [15]. If the solution $x$ is a Markov process,
the additional mathematical structure available has compelling consequences.
In the remaining paragraphs of this section some of the important aspects of
this case will be summarized. As most of the analysis of stochastic systems
has been done in this setting, only a few of those results related to this
research will be presented.

Consider the stochastic equation (all elements are real valued)

$$dx(t,\omega) = a(t,x(t,\omega))\, dt + b(t,x(t,\omega))\, dw(t); \quad x(0,\omega) = x_0(\omega),\ t \in R^+,\ \omega \in \Omega.$$

As usual this equation is but a shorthand for the integral equation

(2) $x(t) - x(s) = \int_s^t a(\tau,x(\tau))\, d\tau + \int_s^t b(\tau,x(\tau))\, dw(\tau)$; $t, s \in R^+$.

Here, subject to the assumption of Lipschitz continuity on the coefficient
functions $a$ and $b$, and the assumption that $\mathcal{B}(x_0)$ is independent of
$\mathcal{B}_{0\infty}(dw)$, a solution of (2) may be shown to exist as an element of
$S(\Omega;C(R^+))$. Moreover, from the form of (2) it is easy to see that the
solution $x$ is a Markov process. In fact $x$ is a strong Markov process
(begins afresh at random times, see Ito [39] or McKean [50]), and so is
a diffusion.

As a Markov process the solution $x$ is characterized by a transition
operator (Dynkin [19, Chapter 3]) $P : R^+ \times R^+ \times R \times \mathcal{B}(R) \to R$. Here
$P(t,s,x_0,E)$ expresses the probability that at time $t \in R^+$, starting in
state $x_0$ at time $s \in R^+$ ($s \le t$), the solution $x(t)$ is an element of
$E \in \mathcal{B}(R)$.

The following properties characterize $P$:

(i) For every $(t,s,x) \in R^+ \times R^+ \times R$, $P(t,s,x,\cdot)$ is a probability
measure on $\mathcal{B}(R)$.

(ii) For every $E \in \mathcal{B}(R)$, $P(t,s,x,E)$ is a measurable function of
$t, s, x$ jointly on the appropriate domain.

(iii) For every $(t,s,x,E)$ and $r \in [s,t]$

$$P(t,s,x,E) = \int_R P(t,r,y,E)\, P(r,s,x,dy).$$

(iv) $P(s,s,x,E) = 1_E(x)$ for every $s \in R^+$.
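Property (iii) can be verified numerically for a process whose transition density is known in closed form. The sketch below is an illustration under an assumption not made in the text: the process is a scaled Wiener process, x(t) = x(s) + σ(w(t) − w(s)), whose transition density is Gaussian. The composition integral is evaluated by the trapezoid rule on a truncated interval; names and grid sizes are hypothetical.

```python
import math


def gauss(y, mean, var):
    """Gaussian density with the given mean and variance, evaluated at y."""
    return math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def p(t, s, x, y, sigma=1.0):
    """Transition density of x(t) = x(s) + sigma*(w(t) - w(s)), for t > s."""
    return gauss(y, x, sigma ** 2 * (t - s))


def chapman_kolmogorov_rhs(t, r, s, x, y, half_width=12.0, n=4000):
    """int p(t,r,z,y) p(r,s,x,z) dz over [x - half_width, x + half_width] (trapezoid rule)."""
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n + 1):
        z = x - half_width + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * p(t, r, z, y) * p(r, s, x, z)
    return total * h
```

For s < r < t, `chapman_kolmogorov_rhs(t, r, s, x, y)` should match `p(t, s, x, y)` to quadrature accuracy.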

Condition (iii) is the familiar Chapman-Kolmogorov equation for Markov
processes [19]. This key property of $P$ defines a two-parameter family of
operators on $L_\infty(R)$, the bounded measurable functions mapping $R$ into
itself, according to the rule

$$(T_{t,s} f)(x) = \int_R f(y)\, P(t,s,x,dy).$$

For those functions $f$ for which it exists, the limit

$$L_\infty\text{-}\lim_{t \downarrow s} \frac{T_{t,s} f - f}{t - s}$$

defines the operator

$$(A_s f)(x) = a(s,x)\, \frac{\partial f}{\partial x}(x) + \frac{1}{2} b^2(s,x)\, \frac{\partial^2 f}{\partial x^2}(x)$$

whose domain $\mathcal{D}(A_s)$ includes at least $C_0^2(R;R)$, the space of functions
$R \to R$ having compact support and two continuous derivatives. See for
instance [19] for more details.

On the set of probability measures on $R$ ($PM(R)$) the transition
operator $P$ defines a second two-parameter family of operators; for $\mu \in PM(R)$

$$(U_{t,s}\, \mu)(E) = \int_R P(t,s,x,E)\, \mu(dx).$$

In a sense (which may be made precise [19]) $U$ may be regarded as the
adjoint of $T$. Observe that $U$ defines the evolution of the probability
distribution of $x$, the solution of (2). That is, if $\mu_s$ is the distribution
of the initial state $x(s)$, then $U_{t,s}\, \mu_s$ is that of $x(t)$, $t \ge s$, on $\mathcal{B}(R)$.

If $P$ as a measure on $\mathcal{B}(R)$ has a "derivative" (Radon-Nikodym [30])
with respect to Lebesgue measure $dy$, then denoting this function by $p$:

$$P(t,s,x,E) = \int_E p(t,s,x,y)\, dy.$$

Moreover, the function $p$ (which may be a generalized function if need be)
satisfies the equations, for $0 \le s \le t$,

(3a) $A_s p = a(s,x)\, \frac{\partial p}{\partial x}(t,s,x,y) + \frac{1}{2} b^2(s,x)\, \frac{\partial^2 p}{\partial x^2}(t,s,x,y) = -\frac{\partial p}{\partial s}(t,s,x,y)$,

(3b) $A_t^* p = -\frac{\partial}{\partial y}[a(t,y)\, p(t,s,x,y)] + \frac{1}{2} \frac{\partial^2}{\partial y^2}[b^2(t,y)\, p(t,s,x,y)] = \frac{\partial p}{\partial t}(t,s,x,y)$.

Here $A_s$ is the "generator" of $T_{t,s}$ and $A_t^*$ is formally its adjoint. Of
course (3a) and (3b) are the well-known Kolmogorov backward and forward
equations. The latter is also frequently called the Fokker-Planck equation.
For (3b) the fundamental solution is generated by the initial condition
($\delta$ = the Dirac function) $p(s,s,x,y) = \delta(x-y)$. And for (3a), $p(t,t,x,y) =
1_\Gamma(x)$ defines $P(t,s,x,\Gamma)$ for $0 \le s \le t$.
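The forward (Fokker-Planck) equation can be checked by finite differences on a diffusion with a known transition density. The sketch below is an illustration under assumptions not in the text: the drift is a(y) = −θy and the diffusion coefficient is the constant b = σ (the Ornstein-Uhlenbeck process), started at x at time 0. The residual ∂p/∂t + ∂(a p)/∂y − ½σ² ∂²p/∂y² should vanish up to discretization error. Function names and step sizes are hypothetical.

```python
import math


def ou_density(t, x, y, theta, sigma):
    """Transition density p(t, 0, x, y) of dx = -theta*x dt + sigma dw."""
    mean = x * math.exp(-theta * t)
    var = sigma ** 2 * (1.0 - math.exp(-2.0 * theta * t)) / (2.0 * theta)
    return math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def forward_residual(t, x, y, theta=0.7, sigma=1.3, dt=1e-5, dy=1e-3):
    """Central-difference residual of the forward equation at the point (t, y)."""
    p = lambda tt, yy: ou_density(tt, x, yy, theta, sigma)
    dp_dt = (p(t + dt, y) - p(t - dt, y)) / (2.0 * dt)
    flux = lambda yy: -theta * yy * p(t, yy)            # a(y) * p
    dflux_dy = (flux(y + dy) - flux(y - dy)) / (2.0 * dy)
    d2p_dy2 = (p(t, y + dy) - 2.0 * p(t, y) + p(t, y - dy)) / dy ** 2
    return dp_dt + dflux_dy - 0.5 * sigma ** 2 * d2p_dy2
```

The same pattern applies to the backward equation by differencing in the initial variables instead.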

Note that (3b) makes little sense unless the coefficients $a$ and $b$ are
sufficiently smooth. Equation (3a) has the obvious advantage that it applies
even if the coefficients are not well-behaved. Moreover, it is known [19]
that if $a$ and $b$ are bounded and Lipschitz (Holder) continuous*, and $b^2$ is
everywhere positive-definite, then (3a) has a smooth, unique, fundamental
solution. This solution precisely defines the distribution of the process
$x$ corresponding to $A_s$, starting from any initial distribution, according
to

$$(U_{t,s}\, \mu)(E) = \int_E \int_R p(t,s,x,y)\, \mu(dx)\, dy$$

where $\mu$ is the distribution (on $\mathcal{B}(R)$) of $x(s)$.

A modification of this concept yields a means of solving arbitrary 
equations of the form 

(4) A fi u ■ , u(s,x) - f(x). 

That is, since $E_{t,x} f(x(s)) = \int_R p(t,s,x,y)\, f(y)\, dy$ ($E_{t,x}\, \xi$ is the
expectation of $\xi$ conditioned on $x(t) = x$), then clearly $u(t,x) = E_{t,x} f(x(s))$
"solves" (4). See [19, Chapter 13] for more details. Taking into account
the interpretations afforded by the stochastic differential equation for
$x$, this solution method is more than a tautology.

The problem corresponding to the analysis of the past three sections
in this setting is to study the behavior of the function p(t,s,x,y) as a
solution of (3a) as t-s approaches infinity. In other than specific
instances this analysis uses certain auxiliary functions with properties
similar to Lyapunov functions. For the case of time-varying coefficients
(a and b) under consideration here the best result is due to Il'in and
Khas'minskii [38]:


* See [59] for an analysis of equations with less restricted coefficients.

Theorem 5 : [38, p.248] Let p(t,s,x,y) be the fundamental solution of (3a)
for $t \ge s$. Let V(t,r) be a positive function, monotonically decreasing
with respect to t, and nondecreasing with respect to $r \ge 0$, such that for
all s

(I)   $a(t-s,x) + \frac{1}{2}\,\frac{\partial b^2}{\partial x}(t-s,x) < 0$

(II)  $V(0,r) > 1$ ;  $r > 0$.

(III) $\int_{R^+} V(t,r)\,dt < \infty$ ;  $r > 0$.

Then for every measurable function f, $\int_R p(t,s,x,y)\,f(y)\,dy \to \alpha$
as $t-s \to \infty$, where $\alpha$ is a constant and $\alpha > 0$ if $f(x) \ge 0$.

Proof : Put

    $u(t,x) = \int_R p(t,s,x,y)\,f(y)\,dy$

in Theorem 3 of [38] and the result follows.

Corollary 6 : [38, p. 255] Let a(t,x) be bounded for all $x \in R$, $t \ge s$, and
$a(t,x) + x\,b(t,x) < -B < 0$; then the conclusion of Theorem 5 holds.

If the coefficients are time-invariant in (2), that is,

(5)   $dx(t) = a(x(t))\,dt + b(x(t))\,dw(t)$

then the Markov process x may be described by a transition operator
$P : R^+ \times R \times \mathcal{B}(R) \to R^+$. In this case P(t,x,E) gives the probability that
$x(t) \in E$ given that $x(0) = x$. The sets of operators $\{T_t\}_{t \in R^+}$ and $\{U_t\}_{t \in R^+}$,

    $(T_t f)(x) = \int_R f(y)\,P(t,x,dy)$

    $(U_t\,\mu)(E) = \int_R P(t,x,E)\,\mu(dx)$

are in this case semigroups, $T_t \circ T_s = T_{t+s}$ and $U_t \circ U_s = U_{t+s}$, as a
consequence of the Chapman-Kolmogorov relations. The infinitesimal generator
A of T is defined as the limit

    $L_\infty\text{-}\lim_{t \downarrow 0} \frac{T_t f - f}{t} = Af .$

Here $Af = a(x)\,\frac{\partial f}{\partial x} + \frac{1}{2}\,b^2(x)\,\frac{\partial^2 f}{\partial x^2}$, and the equations (3a) and (3b)
for the density function p of P are

(6a)  $\frac{\partial p(t,x,y)}{\partial t} = a(x)\,\frac{\partial p(t,x,y)}{\partial x} + \frac{1}{2}\,b^2(x)\,\frac{\partial^2 p(t,x,y)}{\partial x^2}$

(6b)  $\frac{\partial p(t,x,y)}{\partial t} = -\frac{\partial [a(y)\,p(t,x,y)]}{\partial y} + \frac{1}{2}\,\frac{\partial^2 [b^2(y)\,p(t,x,y)]}{\partial y^2}$

Or concisely,

(6a)'  $\partial p/\partial t = Ap$ ,  $p(0,x,y) = \delta(x-y)$

(6b)'  $\partial p/\partial t = A^*p$ ,  $p(0,x,y) = 1_E(x)$ .

The problem corresponding to Theorem 5 above is to establish the
existence of an invariant distribution for x. Such a distribution is an
element $\mu$ of PM(R) such that $\mu = U_t\,\mu$ for every $t \ge 0$. It is an equivalent
problem to look for solutions to $A^*u = 0$. For let u be the density of the
invariant measure $\mu$ with respect to Lebesgue measure, $\mu(E) = \int_E u(x)\,dx$,
and let p(t,x,y) be the density of P(t,x,E). Then again the definition of
an invariant distribution is $\mu(E) = (U_t\,\mu)(E)$ or

    $\int_E u(z)\,dz = \int_R P(t,x,E)\,\mu(dx) = \int_E \Big[ \int_R p(t,x,y)\,u(x)\,dx \Big]\,dy$

or since $E \in \mathcal{B}(R)$ was arbitrary

    $u(y) = \int_R p(t,x,y)\,u(x)\,dx .$

Assuming the right-hand side to be twice differentiable in y and once in t
under the integral sign, and assuming p as a function of t and y satisfies
$A^*p = \partial p/\partial t$, then

    $(\partial/\partial t - A^*)\,u(y) = \int_R (\partial/\partial t - A^*)\,p(t,x,y)\,u(x)\,dx = 0 .$

And so, since u does not depend on t, $A^*u = 0$, justifying the claim.
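
The relation between invariant densities and the stationary forward equation
can be checked numerically in a simple case. The following Python sketch is
illustrative only: it assumes the particular coefficients a(x) = -x and
b(x) = 1 in (5), which are not taken from the text, and for these the Gaussian
density u(y) = exp(-y^2)/sqrt(pi) is the natural candidate. The helper name
adjoint_generator and the finite-difference step h are likewise hypothetical
choices.

```python
import math

def a(y):
    # Assumed drift coefficient (illustrative choice, not from the text).
    return -y

def b(y):
    # Assumed diffusion coefficient (illustrative choice, not from the text).
    return 1.0

def u(y):
    # Candidate invariant density: Gaussian with mean 0 and variance 1/2.
    return math.exp(-y * y) / math.sqrt(math.pi)

def adjoint_generator(density, y, h=1e-3):
    """Central-difference approximation of (A*u)(y) = -(a u)' + 0.5 (b^2 u)''."""
    flux = lambda z: a(z) * density(z)
    spread = lambda z: b(z) ** 2 * density(z)
    d_flux = (flux(y + h) - flux(y - h)) / (2.0 * h)
    d2_spread = (spread(y + h) - 2.0 * spread(y) + spread(y - h)) / (h * h)
    return -d_flux + 0.5 * d2_spread

# Residuals of A*u = 0 at a few sample points.
residuals = [abs(adjoint_generator(u, y)) for y in (-2.0, -0.5, 0.0, 0.7, 1.5)]
print(max(residuals))
```

Under these assumptions the residuals should be at the level of the
discretization error, whereas a density that is not invariant (for example a
Gaussian of variance 1) leaves a residual of order one at y = 0.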

Before considering the invariant measure problem from this point of
view, it is appropriate to return to the transition operator and examine it
more closely. The next paragraphs follow Khas'minskii [43]. Assume the
following:

(i) The process x as a solution of (5) has continuous sample paths.

(ii) The operators $T_t$ map C(R) into C(R); that is, x is a Feller process [19].

(iii) The process x is non-degenerate, or equivalently, P(t,x,U) > 0
holds for any open set U of positive Lebesgue measure.

(iv) The process x is a strong Markov process.

(v) The process x is recurrent; i.e., there exists a compact subset
K of R such that for every $x \in R$, P(t,x,K) = 1 for some $t \in R^+$.

Proposition 7 : [43, p. 180] The trajectories of the process x are everywhere
dense in R.

The relevant result derived from these assumptions is given in

Theorem 8 : [43, p. 182] For the recurrent diffusion process x as a solution
of (5) there exists a non-trivial, unique $\sigma$-finite invariant measure
$\mu$. If $\mu(R) < \infty$, then

    $\lim_{T \to \infty} \frac{1}{T} \int_0^T P(t,x,E)\,dt = \mu(E)/\mu(R) .$


Compare the second assertion of this theorem with the arguments in the
proof of Theorem 3.2.5. See Doob [14] for related remarks on this convergence.
Using Doob [14, Theorem 5], Khas'minskii is actually able to conclude that
if $\mu(R)$ is finite, then $P(t,x,E) \to \mu(E)$ for every $x \in R$. Finiteness of
$\mu$ may be shown under minor additional restrictions on the process x.
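
The convergence of P(t,x,E) to the (normalized) invariant measure can be
illustrated by simulation. The sketch below is a hedged example and not part
of the cited results: it assumes the coefficients a(x) = -x, b(x) = 1 in (5),
for which the normalized invariant measure is Gaussian with mean 0 and
variance 1/2; it uses the Euler-Maruyama discretization as a stand-in for the
exact process; and it takes E = [-1/2, 1/2]. The function name, step size,
horizon, seed, and number of sample paths are arbitrary illustrative choices.

```python
import math
import random

def simulate_endpoint(x0, horizon, dt, rng):
    """Euler-Maruyama approximation of x(horizon) for dx = -x dt + dw, x(0) = x0."""
    x = x0
    sqrt_dt = math.sqrt(dt)
    for _ in range(int(round(horizon / dt))):
        x += -x * dt + sqrt_dt * rng.gauss(0.0, 1.0)
    return x

rng = random.Random(0)  # fixed seed so that the run is repeatable
x0, horizon, dt, n_paths = 3.0, 4.0, 0.01, 1000

endpoints = [simulate_endpoint(x0, horizon, dt, rng) for _ in range(n_paths)]

# Monte Carlo estimate of P(t, x0, E) with E = [-1/2, 1/2].
p_hat = sum(1 for x in endpoints if abs(x) <= 0.5) / n_paths

# Mass of E under N(0, 1/2): erf(0.5 / (sigma * sqrt(2))) with sigma^2 = 1/2.
p_invariant = math.erf(0.5)

print(p_hat, p_invariant)
```

Although the paths start well outside E, the estimated transition probability
should lie close to the invariant mass of E, up to Monte Carlo and
discretization error.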

Returning to the density equations, the precise conditions for x to 
have an invariant measure are given in 


Theorem 9 : [43, p. 190] In order that x have a finite invariant measure,
it is necessary and sufficient that $Au = -1$ have a positive solution in
$R \setminus D$ for some bounded domain D with smooth boundary $\partial D$. Moreover, in this
case, for any measurable function f

    $\lim_{t \to \infty} \int_R p(t,x,y)\,f(y)\,dy = \int_R f(y)\,\mu(dy)$

where $\mu$ is the invariant measure, and p the fundamental solution of
$A^*p = \partial p/\partial t$.

Proved by arguments involving the first entrance times into the domain
D, Theorem 9 depends critically on the smoothness properties of $\partial D$. This
is of course a significant condition and in most instances a handicap.
Based on the paper [43], Wonham's paper [71] contains some important
sufficient conditions guaranteeing the hypothesis of Theorem 9, and thus
recurrence and invariance of the solution process x. His conditions use
Lyapunov functionals of the state.

Let $S_r$ denote the open ball of radius r > 0 in Euclidean space and
assume that the function v on R satisfies the following:

(i) v is twice continuously differentiable.

(ii) $v(x) > 0$ for $x \notin S_r$, $v(x) \to \infty$ as $|x| \to \infty$.

Theorem 10 : [71, p. 200] If there exists a function v satisfying (i) and
(ii) above and such that $Av < -1$, then x, the solution of (5), has a
unique invariant distribution.
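
As a hedged illustration of how a condition of the form Av < -1 might be
checked for a candidate function, the following sketch assumes a(x) = -x and
b(x) = 1 in (5) together with v(x) = x^2; none of these choices comes from
[71]. For them Av = 1 - 2x^2, so the inequality holds outside the closed ball
of radius 1 and fails near the origin. The helper name generator, the test
radius 1.1, and the grid are hypothetical.

```python
def a(x):
    return -x      # assumed drift (illustrative)

def b(x):
    return 1.0     # assumed diffusion coefficient (illustrative)

def v(x):
    return x * x   # candidate Lyapunov function (illustrative)

def generator(f, x, h=1e-2):
    """Central-difference approximation of (Af)(x) = a f' + 0.5 b^2 f''."""
    d1 = (f(x + h) - f(x - h)) / (2.0 * h)
    d2 = (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)
    return a(x) * d1 + 0.5 * b(x) ** 2 * d2

radius = 1.1
grid = [radius + 0.1 * k for k in range(40)]  # sample points with |x| >= radius

# Check Av < -1 at +x and -x for every grid point outside the ball.
outside_ok = all(generator(v, s * x) < -1.0 for x in grid for s in (1.0, -1.0))

# Inside the ball the inequality is not required; at the origin Av = 1.
inside_value = generator(v, 0.0)

print(outside_ok, inside_value)
```

A grid check of this kind is only a sanity test on a bounded range; the
hypothesis of the theorem is an inequality for all x outside the ball and has
to be established analytically, as it is here from Av = 1 - 2x^2.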


Although the analysis of Wonham and Khas'minskii relies almost exclusively
on the analytical structure of Markov processes, it is more illuminating
to outline the proofs of Theorems 9 and 10 in the framework used earlier
in this chapter. The idea is simple: from any initial distribution $\mu_0$,
the distributions of x(t) for $t \in R^+$ evolve according to $U_t\,\mu_0 = \mu_t$,
where $U_t$ is the semigroup defined above:

    $\mu_t(E) = U_t\,\mu_0(E) = \int_R P(t,x,E)\,\mu_0(dx) .$

Clearly, $U_t$ is linear and continuous on PM(R) with the topology of weak
convergence; continuity following from the Feller property. Thus, on a
compact set contained in PM(R), $U_t$ is closed and has a fixed point [16, p.456].
Thus, it remains to show that the distributions of x form a compact subset
of PM(R). It is at this point that the Lyapunov functional is used,
see Elliot [22, section 4.3].

Let v be a functional on R satisfying the assumptions (i) and (ii)
above; further assume that v(0) = 0. Then Elliot [22, p.39] shows that for
$\gamma > 0$ the set

    $\Phi(v;\gamma) = \{ \mu \in PM(R) : \int v(x)\,\mu(dx) \le \gamma \}$

is compact (in the weak topology on PM(R)). Elliot's result is the
following:


Theorem 11 : [22, p. 35] Let v satisfying (i) and (ii) above be such that
for positive $c_1, c_2, c_3$

(i)'   $|(Av)(x)| < c_1 (1 + |x|^2)$

(ii)'  $(Av)(x) < c_2 - c_3\,v(x)$ ;  $x \in R$

then there exists an invariant distribution for x.

It is appropriate to remark that this technique of proving the existence
of distributions invariant under time-shifts is commonly used in the
ergodic theory of Markov processes per se. See for example Foguel [27]
for an interesting introduction to this subject. While less constructive
than the use of the steady state Fokker-Planck equation, the technique is
quite similar to that used in the earlier sections of this chapter.

CHAPTER 4

APPLICATIONS, CONCLUSIONS, AND FURTHER RESEARCH

4.1 A Few Remarks on Applications :

In this section two feedback systems profitably modelled as random
will be considered. The purpose here is not to give a complete investigation
of these examples but rather to indicate treatments within the
framework established in the last chapters.

A. The human operator :

As a first example consider the human as a feedback controller. Feedback
systems containing humans arise naturally in many settings [2];
perhaps the most familiar one in an engineering context is the pilot.

In the design of control mechanisms and instrument displays for aircraft 
it is important to have some model of the pilot as the "actuator link" 
between the instruments and the control mechanism. Because of the highly 
individual techniques of pilots [45] and the possibility of a large number 
of pilots flying any particular aircraft, it is appropriate to model the 
human as containing some random parameters when operating in this situation. 

In controlling an aircraft about some nominal trajectory, the human may
be modelled as an essentially linear element subject to random perturbations
in the following manner. In reading the instruments errors are made,
and these errors, being characteristic of individuals, are usually modelled
as the effect of additive noise. Attempting to deduce the state of the
aircraft from these imperfect observations, the human performs a kind of
filtering operation in some optimal manner. This step is usually modelled
as operating on the noisy observation signal with an optimal linear (Kalman)
filter. The next step in the control process is operating the control
mechanism so as to correct for any perceived errors from the nominal
trajectory. At this stage a delay is introduced as a consequence of the
neuro-motor delays of the human. Moreover, noise is usually added here to
account for the errors in manipulating the controls. This model of the
human controller in a steady-state control task reduces to the cascade
of elements shown in Figure 1.



Defining the Kalman filter by its impulse response k, the input-output
equation of the model is

    $y(t) = k_0\,m(t) + k_0 \int k(t - \Delta_m - s)\,[x(s - \Delta_o) + o(s - \Delta_o)]\,ds$ ;  $t > \Delta_o + \Delta_m$

    $y(t) = k_0\,m(t)$ ;  $t < \Delta_o + \Delta_m$ .

For any model of the aircraft (about the nominal operating point) analysis
of feedback systems including the human operator model above is
straightforward from this point (except for the presence of delays) by
familiar methods.

Frequently, however, a less detailed model of the human as a white
noise gain is used to obtain worst case results in experiments involving
a wide range of operating conditions [2]. In this case the model of
Figure 2 applies.



[Block diagram: linear element K(s) and multiplicative white noise N(t), with output y.]

Figure 2 : A crude model of the human operator.

Here K represents the combined effects of the human's filtering action
and (Padé) approximations to the delays. Thus,

    $y(t) = \int_0^t k(t-s)\,x(s)\,dN(s)$

as an Ito integral, describes the transfer of observation (x) into control
action (y) by the human. Here k is the impulse response of the linear
element K. Again for an appropriate linear model of the aircraft in steady
state operation, analysis of the human as a controller is straightforward
using results like Theorems 3.2.4, 3.3.4, and 3.3.5. The latter give easy
sufficient conditions in terms of the frequency response of the linear
elements for boundedness of the signals in the control loop.






In the event that the human model includes a nonlinearity satisfying
sector conditions as used in Chapter 3, perhaps reflecting thresholds of
no response [45], then analysis using Theorem 3.2.5, etc., is no more
difficult than in the linear case.


B. Analysis of round-off errors in numerical computations : 

In the first chapter the point was raised that the accumulation of
round-off errors in a numerical computation could be considered as a
stochastic process. Though in actuality a deterministic phenomenon, the
randomization of the error evolution is warranted by the extreme complexity
of any nontrivial computation on a large machine. The development
of a statistical model takes the following form (this analysis is drawn
from Henrici [34], [35]).

Most numerical algorithms consist of generating a sequence of numbers
$x_0, x_1, \ldots$, defined by the relations

    $x_n = F_n(x_0, \ldots, x_{n-1})$ ;  n = 1,2,...

In actual machine computations, however, the algorithm is only approximately
realized and machine numbers $\tilde{x}_n$ (of finite length) are generated by the
approximate realizations $\tilde{F}_n$ by

    $\tilde{x}_n = \tilde{F}_n(\tilde{x}_0, \ldots, \tilde{x}_{n-1})$ ;  n = 1,2,...

Write

    $\tilde{x}_n = F_n(\tilde{x}_0, \ldots, \tilde{x}_{n-1}) + e_n$

and consider this as the definition of the local rounding error $e_n$. Thus,

    $e_n = \tilde{F}_n(\tilde{x}_0, \ldots, \tilde{x}_{n-1}) - F_n(\tilde{x}_0, \ldots, \tilde{x}_{n-1})$ .

Each local rounding error is propagated through the remainder of the computation
(from n on), and in this process its effect on the final accumulated
error may be amplified or diminished. The accumulated rounding error
$r_n$ at any stage is defined as the difference between the numerical result
and the correct theoretical result; here $r_n = \tilde{x}_n - x_n$.

Clearly, knowledge of the machine approximations $\tilde{F}_n$ would permit one
to determine worst case bounds for the error evolution under the unlikely
hypothesis that each local rounding error has the most unfavorable effect on the
accumulated error. Such a systematic reinforcement of errors is unlikely
in any typical computation, and the bounds obtained under this assumption
are usually uninformative. It is the need to have some appraisal of the
"average" growth of round-off errors that motivates the statistical
assumptions.

Therefore, assume that each $e_n$ is for each n a random variable on some
probability space $(\Omega, \mathcal{F}, P)$. The accumulated error evolves according to

    $r_n = e_n + F_n(\tilde{x}_0, \ldots, \tilde{x}_{n-1}) - F_n(x_0, \ldots, x_{n-1})$

    $\phantom{r_n} = e_n + H_n(r_0, \ldots, r_{n-1})$ .

The stability problem becomes the following: given the statistics of the
stochastic process $\{e_n\}_{n \in Z^+}$, describe those of the process $\{r_n\}_{n \in Z^+}$
as $n \to \infty$. Of particular interest are bounds on the mean and variance of the
process $\{r_n\}$, as these are easily determined and indicate the average
rate of growth of the errors. The general conditions of section 3.1 enable
one to constrain the operator H so as to assure compactness of the distributions
of $\{\pi_n r\}_{n \in Z^+}$ (the truncations of r) on some sequence space
and guarantee asymptotic invariance (with n) of the statistics of r if e
is stationary. Moreover, the moment bounds determine the asymptotic limit
distribution approximately. In certain linear integration schemes (the
operator H becomes a linear convolution) the results of section 3.2 apply
immediately. In relation to this point see [66], where the analog of Theorem
3.2.4 is given for random sequences.
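
The error recursion of this section can be made concrete with a small exact
computation. The sketch below is a hedged illustration and not Henrici's
analysis: it assumes a hypothetical one-step algorithm x_n = x_{n-1}/3 + 1,
a machine that rounds every result to three decimal places, and exact
rational arithmetic for the reference solution. For this linear choice the
operator H reduces to multiplication by 1/3, so that r_n = e_n + r_{n-1}/3.
All names (exact_step, machine_step, run, UNIT) are illustrative.

```python
from fractions import Fraction

C = Fraction(1, 3)                     # assumed factor in the illustrative recursion
DIGITS = 3                             # machine numbers carry three decimal places
UNIT = Fraction(1, 2) / 10 ** DIGITS   # largest possible local rounding error

def exact_step(x):
    """F: one exact step of the hypothetical algorithm x_n = x_{n-1}/3 + 1."""
    return C * x + 1

def machine_step(x):
    """Approximate realization of F: the exact step rounded to DIGITS decimals."""
    return round(exact_step(x), DIGITS)

def run(x0, steps):
    """Return a list of triples (e_n, r_n, r_{n-1}) as exact rationals."""
    x_exact = x0
    x_mach = round(x0, DIGITS)
    r_prev = x_mach - x_exact
    records = []
    for _ in range(steps):
        e = machine_step(x_mach) - exact_step(x_mach)  # local rounding error
        x_mach = machine_step(x_mach)
        x_exact = exact_step(x_exact)
        r = x_mach - x_exact                           # accumulated rounding error
        records.append((e, r, r_prev))
        r_prev = r
    return records

records = run(Fraction(2, 7), 30)
worst_case = UNIT / (1 - C)            # bound if every local error were maximal
print(max(abs(r) for _, r, _ in records), worst_case)
```

In this sketch the worst case bound UNIT/(1 - C) plays the role of the
pessimistic estimate discussed above; a statistical treatment would instead
model the e_n as random variables and estimate the mean and variance of r_n.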

4.2 Conclusions and Suggestions for Future Research : 

In order to place the present work in perspective it is necessary to
place the study of stochastic systems within the theory of dynamical systems.
Although it is too early for the latter task, some points are clear.
First, the study of dynamical systems has proven to be one of the most
fruitful branches of engineering and mathematics, and for this reason any
extensions and generalizations should be pursued for additional insight.
The admission of stochastic variables in optimization problems has led to
a much better understanding of the role of information patterns in control
systems as may be judged from the several papers on this subject in the
Bibliography. Secondly, the application of stochastic systems as models
for complex physical systems would seem to be promising; the demonstrated
success of a few definitive case studies would strengthen this assertion.

Of a more technical nature is the observation that the properties of
causal, dynamical systems are deeply related to those of Markov processes.
A general examination of the relationship between causality and the Markov
property beyond the obvious would seem valuable. Certainly the description
of systems by stochastic differential equations interpreted in the analytical
theory of Markov processes has provided a rich class of systems
described by partial differential equations. Viewed from the field of
what has come to be called distributed parameter systems, this aspect of
stochastic systems permits an easy interpretation of the properties of
the distributed solution as a probability density function. Moreover, the
additional interpretation provided by the differential equation for the
sample trajectories of the process cannot be but an asset in the analysis
of the partial differential equation. This relationship between distributed
systems and Markovian systems is largely unexploited as such.

As the remarks above reflect some of the tentative aspects of the
stochastic systems theory, so must the present work be regarded as preliminary
in nature. For as an investigation of the problem of determining
the transformations of probability distributions by dynamical (feedback)
systems, its provisional aspects are apparent. Perhaps the most significant
drawback is the non-constructive nature of the analysis. It would be
an important extension of this work to render the process of analysis
constructive, though this is likely to be equivalent to solving the implicit
feedback equations and hence impossible in general.

However, as an alternative approach to the analysis of the asymptotic
properties of stochastic systems, this work has succeeded in making the
Prohorov theory directly applicable to this kind of analysis. In this context
the work is antedated by that of Ito and Nisio [41] and Fleming and
Nisio [26], though the explicit connection of deterministic operator
stability theory and the Prohorov theory, using the results of Topsøe,
appears to be novel. Finally, the specific results of sections 3.2 and 3.3
are interesting as generalizations of deterministic counterparts, the
Nyquist and Circle Theorems. Examined along with papers like [21], [47],
[73], [46], and [66], these theorems should increase the understanding of
stochastic systems containing linear elements.

Further comments on this work may be usefully made by suggesting a 
few extensions and modifications. In addition to the general statements 
above, consider then the following precise problems. 

A. Stability conditions based on empirical distributions : 

Of course one of the primary objections to this work is its a priori 
assumption of given distributions for the perturbation inputs and random 
parameters. In any practical experiment these are seldom given and usually 
difficult to determine experimentally, though appropriate statistical 
methods are available. About the most complete characterization one could
reasonably hope for is a number of empirical distributions for the uncer- 
tainties derived from samples of the processes. It would be, therefore, very 
useful to determine conditions based on empirical distributions of the 
inputs and outputs that assure the asymptotic regularity of the outputs 
in the sense used previously. These conditions would have to apply for a 
class of distributions which could give rise to those observed empirically. 
The Prohorov theory has potential applications here especially on the space 
D of piecewise continuous functions, see some comments to this effect in [7]. 
The definitions of stochastic systems given in section 3.1 are designed to 
permit a number of possible distributions for the uncertainties present, 
and may prove useful in the early stages of work on this problem. 

B. Stochastic systems with nonlinear state spaces : 

Consider the problem of designing a feedback control law to accurately
orient a rigid body (satellite) in orbit. The perturbations are essentially
stochastic in nature, arising from position sensor errors and natural phenomena.
As is well known the attitude of a rigid body in a fixed coordinate
system is described by a set of 3 x 3 orthogonal matrices, a set not
closed under addition. Hence, the control problem must be analyzed in a
setting where the state space of the system is a nonlinear manifold.

Note that the analysis of round-off errors may be considered in this
framework, as the local errors are confined to a fixed interval and may
be considered as random variables on a circle, see [23, p.61].

One of the reasons for seeking problems with nonlinear state spaces
is the good possibility of obtaining explicit analytical solutions to
the diffusion equations (for the probability density functions of the state).
There are rather few diffusion equations, aside from the Gauss-Markov case,
that admit an explicit solution in the usual vector space setting. For
certain special manifolds explicit solutions to Laplace's equation are
well known and may be used to describe Brownian motions on these manifolds
[18]. Other references are Elliot [22], McKean [50] and the references
therein. Research on this problem should provide interesting enhancements
of the work in Brockett [9].

C. Passive stochastic systems : 

Of a rather more technical nature is the problem of describing the
analog of passivity in a stochastic setting. Recall that a deterministic
operator G on the Hilbert space $(H, \langle\cdot,\cdot\rangle)$ is said to be passive if

    $\mathrm{Re}\,\langle x, Gx \rangle \ge 0$ for every $x \in H$.

This is equivalent to the physical notion of a system which always dissipates
energy [64].

Let $L_2(R^+)$ denote the space of square integrable, real-valued functions
on $R^+$, and $F(\Omega; L_2)$ be the set of $L_2$-valued random variables. Then clearly
the set $(\mathcal{H}(\Omega; L_2), \langle\cdot,\cdot\rangle)$, where for $x, y \in \mathcal{H}(\Omega; L_2) \subset F(\Omega; L_2)$

    $\langle x, y \rangle = E\{ \int_0^\infty x(t,\omega)\,y(t,\omega)\,dt \}$

and $\langle x, x \rangle < \infty$ for every x, is a Hilbert space. Moreover, the inequality
$\langle x, Gx \rangle \ge 0$ makes perfect mathematical sense for some (random) endomorphism
G on $\mathcal{H}(\Omega; L_2)$, and it is easy to give the Positive Operator stability theorem
[74, p.235] of the deterministic theory in this setting. Physical interpretations
of the result are less easy, however, and apparently some
notion of random spectra must be developed. Useful ideas are likely to be
found in statistical mechanics [62].